production SAL

2021-10-20
13:24	<mwdebug-deploy@deploy1002>	helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .	[production]
13:21	<hashar@deploy1002>	Synchronized php: group1 wikis to 1.38.0-wmf.5 refs T281169 (duration: 01m 02s)	[production]
13:20	<hashar@deploy1002>	rebuilt and synchronized wikiversions files: group1 wikis to 1.38.0-wmf.5 refs T281169	[production]
13:11	<kormat@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on 7 hosts with reason: Schema change s3 T277116	[production]
13:11	<kormat@cumin1001>	START - Cookbook sre.hosts.downtime for 3:00:00 on 7 hosts with reason: Schema change s3 T277116	[production]
13:04	<ema@puppetmaster1001>	conftool action : set/pooled=yes; selector: name=cp3062.esams.wmnet,service=ats-tls	[production]
13:04	<ema@puppetmaster1001>	conftool action : set/pooled=yes; selector: name=cp3062.esams.wmnet,service=varnish-fe	[production]
12:51	<ema>	cp3062: bump vsl_space from 80M (default) to 512M T293879 - varnish restart needed	[production]
12:37	<kormat@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on 14 hosts with reason: Schema change s1 T277116	[production]
12:36	<kormat@cumin1001>	START - Cookbook sre.hosts.downtime for 3:00:00 on 14 hosts with reason: Schema change s1 T277116	[production]
12:17	<mwdebug-deploy@deploy1002>	helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .	[production]
12:09	<mwdebug-deploy@deploy1002>	helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .	[production]
12:02	<urbanecm@deploy1002>	Finished scap: 802d3b7: e4f7f85: CreateAccountCampaign: Support for recurring donors (T293699) (duration: 25m 19s)	[production]
11:57	<mwdebug-deploy@deploy1002>	helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .	[production]
11:49	<mwdebug-deploy@deploy1002>	helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .	[production]
11:46	<jmm@cumin2002>	END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts testvm2007.codfw.wmnet	[production]
11:40	<jmm@cumin2002>	START - Cookbook sre.hosts.decommission for hosts testvm2007.codfw.wmnet	[production]
11:37	<btullis@cumin1001>	END (PASS) - Cookbook sre.hadoop.roll-restart-masters (exit_code=0) restart masters for Hadoop test cluster: Restart of jvm daemons. - btullis@cumin1001	[production]
11:37	<urbanecm@deploy1002>	Started scap: 802d3b7: e4f7f85: CreateAccountCampaign: Support for recurring donors (T293699)	[production]
11:32	<jmm@cumin2002>	END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts testvm2005.codfw.wmnet	[production]
11:21	<moritzm>	installing ffmpeg security updates	[production]
11:15	<urbanecm@deploy1002>	Synchronized wmf-config/InitialiseSettings.php: e520fc57411bb19123766192cd636396ea6fc59d: GrowthExperiments: Add campaign pattern for enwiki (T293699) (duration: 01m 22s)	[production]
11:11	<btullis@cumin1001>	START - Cookbook sre.hadoop.roll-restart-masters restart masters for Hadoop test cluster: Restart of jvm daemons. - btullis@cumin1001	[production]
11:10	<mwdebug-deploy@deploy1002>	helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .	[production]
11:07	<mwdebug-deploy@deploy1002>	helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .	[production]
10:57	<jmm@cumin2002>	START - Cookbook sre.hosts.decommission for hosts testvm2005.codfw.wmnet	[production]
10:13	<kormat@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 13 hosts with reason: Schema change s4 T277116	[production]
10:13	<kormat@cumin1001>	START - Cookbook sre.hosts.downtime for 1:00:00 on 13 hosts with reason: Schema change s4 T277116	[production]
09:59	<kormat@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 9 hosts with reason: Schema change s2 T277116	[production]
09:59	<kormat@cumin1001>	START - Cookbook sre.hosts.downtime for 1:00:00 on 9 hosts with reason: Schema change s2 T277116	[production]
09:52	<kormat@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 11 hosts with reason: Schema change s7 T277116	[production]
09:52	<kormat@cumin1001>	START - Cookbook sre.hosts.downtime for 1:00:00 on 11 hosts with reason: Schema change s7 T277116	[production]
09:05	<kormat@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 9 hosts with reason: Schema change s5 T277116	[production]
09:04	<kormat@cumin1001>	START - Cookbook sre.hosts.downtime for 1:00:00 on 9 hosts with reason: Schema change s5 T277116	[production]
08:50	<kormat@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 9 hosts with reason: Schema change s6 T277116	[production]
08:50	<kormat@cumin1001>	START - Cookbook sre.hosts.downtime for 1:00:00 on 9 hosts with reason: Schema change s6 T277116	[production]
08:01	<elukey@deploy1002>	helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.	[production]
08:01	<elukey@deploy1002>	helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.	[production]
07:16	<marostegui@cumin1001>	END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1118.eqiad.wmnet with OS buster	[production]
07:09	<oblivian@deploy1002>	helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .	[production]
06:49	<marostegui@cumin1001>	START - Cookbook sre.hosts.reimage for host db1118.eqiad.wmnet with OS buster	[production]
06:45	<marostegui@cumin1001>	dbctl commit (dc=all): 'Depool db1118 (s1) for reimage T290865', diff saved to https://phabricator.wikimedia.org/P17552 and previous config saved to /var/cache/conftool/dbconfig/20211020-064529-marostegui.json	[production]
06:41	<marostegui@cumin1001>	END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1126.eqiad.wmnet with OS buster	[production]
06:39	<marostegui@cumin1001>	dbctl commit (dc=all): 'Repool db1106 (s1) after upgrade', diff saved to https://phabricator.wikimedia.org/P17551 and previous config saved to /var/cache/conftool/dbconfig/20211020-063926-marostegui.json	[production]
06:35	<marostegui>	Upgrade db1106	[production]
06:34	<marostegui@cumin1001>	dbctl commit (dc=all): 'Depool db1106 (s1) for upgrade', diff saved to https://phabricator.wikimedia.org/P17550 and previous config saved to /var/cache/conftool/dbconfig/20211020-063431-marostegui.json	[production]
06:31	<dcausse>	restarting blazegraph on wdqs1012	[production]
06:28	<elukey>	reboot analytics1066 - OS showing CPU soft lockups, tons of defunct processes (including node manager) and high CPU usage	[production]
06:21	<marostegui>	Depool clouddb1013 for upgrade	[production]
06:14	<marostegui@cumin1001>	START - Cookbook sre.hosts.reimage for host db1126.eqiad.wmnet with OS buster	[production]