production SAL

2020-02-07
22:20	<jeh>	ceph: round 2 OSD failover and recovery testing on cloudcephosd1003.wikimedia.org T240718	[production]
20:47	<mutante>	OS install on new install_server VMs worked on second attempt, issues are gone. signed puppet certs for install1003.eqiad.wmnet, install2003.codfw.wmnet, initial puppet runs (T224576)	[production]
20:42	<jeh>	ceph: OSD failover and recovery testing on cloudcephosd1003.wikimedia.org T240718	[production]
20:32	<mutante>	ganeti: attempting to reinstall install1003 which failed last time	[production]
17:38	<marostegui@cumin1001>	dbctl commit (dc=all): 'Slowly repool es1019 after on-site maintenance T243963', diff saved to https://phabricator.wikimedia.org/P10350 and previous config saved to /var/cache/conftool/dbconfig/20200207-173850-marostegui.json	[production]
17:36	<twentyafterfour@deploy1001>	Synchronized wmf-config/InitialiseSettings.php: sync InitializeSettings again for lols refs T233866 (duration: 01m 03s)	[production]
17:32	<twentyafterfour@deploy1001>	Synchronized wmf-config/InitialiseSettings.php: sync https://gerrit.wikimedia.org/r/c/operations/mediawiki-config/+/570929 refs T233866 (duration: 01m 02s)	[production]
17:25	<marostegui@cumin1001>	dbctl commit (dc=all): 'Slowly repool es1019 after on-site maintenance T243963', diff saved to https://phabricator.wikimedia.org/P10349 and previous config saved to /var/cache/conftool/dbconfig/20200207-172541-marostegui.json	[production]
17:22	<twentyafterfour@deploy1001>	rebuilt and synchronized wikiversions files: roll back all wikis to 1.35.0-wmf.16 refs T233866	[production]
17:19	<marostegui>	Start MySQL on es1019 after onsite maintenance T243963	[production]
16:46	<filippo@cumin1001>	END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0)	[production]
16:38	<filippo@cumin1001>	START - Cookbook sre.ganeti.makevm	[production]
16:13	<XioNoX>	remove MSS clamping from eqiad/eqord/knams/esams	[production]
16:05	<andrew@deploy1001>	Finished deploy [horizon/deploy@bc777d6]: Fix for T243422 (duration: 03m 45s)	[production]
16:04	<vgutierrez>	pooling cp4030 with buster - T242093	[production]
16:03	<bblack>	removing GRE MTU mitigations from cp[135]xxx - T232602	[production]
16:01	<andrew@deploy1001>	Started deploy [horizon/deploy@bc777d6]: Fix for T243422	[production]
15:50	<vgutierrez@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)	[production]
15:48	<vgutierrez@cumin1001>	START - Cookbook sre.hosts.downtime	[production]
15:25	<vgutierrez>	depool & reimage cp4030 as buster - T242093	[production]
15:21	<vgutierrez>	pooling cp4031 with buster - T242093	[production]
15:20	<vgutierrez>	pooling ncredir3001 running buster - T243391	[production]
15:18	<marostegui>	Restart all instances on db1124 and db1125 to pick up a new replication filter - T240094	[production]
15:11	<marostegui>	Restart all instances on db2094 and db2095 to pick up a new replication filter - T240094	[production]
14:56	<vgutierrez@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)	[production]
14:53	<vgutierrez@cumin1001>	START - Cookbook sre.hosts.downtime	[production]
14:43	<hoo@deploy1001>	Synchronized wmf-config/Wikibase.php: REVERT: Wikibase Client: Fix setting name typo (T244529) (duration: 01m 40s)	[production]
14:43	<Amir1>	ladsgroup@mwmaint1002:~$ mwscript createAndPromote.php --wiki=zhwiki --force "Amir Sarabadani (WMDE)" --sysop (T244578)	[production]
14:40	<hoo@deploy1001>	Scap failed!: 9/11 canaries failed their endpoint checks(http://en.wikipedia.org)	[production]
14:38	<hoo@deploy1001>	Synchronized wmf-config/Wikibase.php: Wikibase Client: Fix setting name typo (T244529) (duration: 01m 20s)	[production]
14:33	<vgutierrez>	depool and reimage ncredir3001 as buster - T243391	[production]
14:32	<vgutierrez>	depool & reimage cp4031 as buster - T242093	[production]
14:23	<vgutierrez>	pooling ncredir3002 running buster - T243391	[production]
13:26	<vgutierrez>	pooling cp4021 with buster - T242093	[production]
13:05	<vgutierrez@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)	[production]
13:03	<vgutierrez@cumin1001>	START - Cookbook sre.hosts.downtime	[production]
12:51	<vgutierrez>	depool and reimage ncredir3002 as buster - T243391	[production]
12:42	<vgutierrez>	depool & reimage cp4021 as buster - T242093	[production]
12:08	<akosiaris@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)	[production]
12:08	<akosiaris@cumin1001>	START - Cookbook sre.hosts.downtime	[production]
11:58	<akosiaris@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)	[production]
11:57	<akosiaris@cumin1001>	START - Cookbook sre.hosts.downtime	[production]
11:25	<vgutierrez>	pooling ncredir5001 running buster - T243391	[production]
11:24	<vgutierrez>	pooling cp4022 with buster - T242093	[production]
11:09	<akosiaris>	undo wikifeeds experiments	[production]
11:07	<akosiaris@deploy1001>	helmfile [EQIAD] Ran 'sync' command on namespace 'wikifeeds' for release 'production' .	[production]
10:42	<akosiaris@deploy1001>	helmfile [EQIAD] Ran 'apply' command on namespace 'wikifeeds' for release 'production' .	[production]
10:40	<vgutierrez@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)	[production]
10:37	<vgutierrez@cumin1001>	START - Cookbook sre.hosts.downtime	[production]
10:36	<akosiaris>	conduct experiments with stopping/starting uwsgi-ores on ores2001 T242705	[production]