__all__ SAL

2021-03-18
07:32	<marostegui@cumin1001>	dbctl commit (dc=all): 'db1126 (re)pooling @ 50%: Slowly repool db1126', diff saved to https://phabricator.wikimedia.org/P14944 and previous config saved to /var/cache/conftool/dbconfig/20210318-073250-root.json	[production]
07:20	<dcausse>	depooling & restarting blazegraph on wdqs1005	[production]
07:19	<marostegui>	Deploy schema change on s4 codfw master, lag will appear - T276150 T276156	[production]
07:17	<marostegui@cumin1001>	dbctl commit (dc=all): 'db1126 (re)pooling @ 25%: Slowly repool db1126', diff saved to https://phabricator.wikimedia.org/P14943 and previous config saved to /var/cache/conftool/dbconfig/20210318-071747-root.json	[production]
07:15	<marostegui@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1156.eqiad.wmnet with reason: REIMAGE	[production]
07:13	<marostegui@cumin1001>	START - Cookbook sre.hosts.downtime for 2:00:00 on db1156.eqiad.wmnet with reason: REIMAGE	[production]
06:32	<marostegui@cumin1001>	dbctl commit (dc=all): 'Add db1161 to dbctl, depooled T258361', diff saved to https://phabricator.wikimedia.org/P14942 and previous config saved to /var/cache/conftool/dbconfig/20210318-063241-marostegui.json	[production]
06:32	<elukey>	force a manual run of create_virtualenv.sh on an-tool1010 - superset down	[analytics]
06:22	<marostegui@cumin1001>	dbctl commit (dc=all): 'Repool db2120', diff saved to https://phabricator.wikimedia.org/P14941 and previous config saved to /var/cache/conftool/dbconfig/20210318-062201-marostegui.json	[production]
06:04	<marostegui@cumin1001>	dbctl commit (dc=all): 'Depool db1126 for schema change', diff saved to https://phabricator.wikimedia.org/P14940 and previous config saved to /var/cache/conftool/dbconfig/20210318-060445-marostegui.json	[production]
04:12	<bstorm>	rebooted tools-sgeexec-0935.tools.eqiad.wmflabs because it forgot how to LDAP...likely root cause of the issues tonight	[tools]
03:59	<bstorm>	rebooting grid master. sorry for the cron spam	[tools]
03:49	<bstorm>	restarting sssd on tools-sgegrid-master	[tools]
03:46	<andrewbogott>	restarting slapd on seaborgium, serpens, and r-o ldap replicas (we're getting irregular connection failures)	[production]
03:37	<bstorm>	deleted a massive number of stuck jobs that misfired from the cron server	[tools]
03:35	<bstorm>	rebooting tools-sgecron-01 to try to clear up the ldap-related errors coming out of it	[tools]
01:46	<bstorm>	killed the toolschecker cron job, which had an LDAP error, and ran it again by hand	[tools]
00:05	<eileen>	tools revision changed from b7b4060c30 to ef54260b0d	[production]
2021-03-17
23:42	<urbanecm@deploy1002>	Synchronized wmf-config/InitialiseSettings.php: c730dd5feb865a8325279cd4e76c133512f14251: idwiki: Deploy Growth features to newcomers (T259024) (duration: 01m 08s)	[production]
23:40	<urbanecm@deploy1002>	Synchronized wmf-config/CommonSettings.php: 5c14e7d2045f0905f7e85b249e821bbe8d69c600: Define confirmed group in MediaWikiServices hook (T275334, T277704, T275310, T275333) (duration: 01m 08s)	[production]
23:30	<ebernhardson@deploy1002>	Synchronized php-1.36.0-wmf.35/extensions/CirrusSearch/profiles/FallbackProfiles.config.php: Add fallback profile including glent m1 (duration: 01m 42s)	[production]
22:27	<robh@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1038.eqiad.wmnet with reason: REIMAGE	[production]
22:25	<robh@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1037.eqiad.wmnet with reason: REIMAGE	[production]
22:25	<robh@cumin1001>	START - Cookbook sre.hosts.downtime for 2:00:00 on mc1038.eqiad.wmnet with reason: REIMAGE	[production]
22:23	<robh@cumin1001>	START - Cookbook sre.hosts.downtime for 2:00:00 on mc1037.eqiad.wmnet with reason: REIMAGE	[production]
20:57	<bstorm>	deployed changes to rbac for kubernetes to add kubectl top access for tools	[tools]
20:52	<robh@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1184.eqiad.wmnet with reason: REIMAGE	[production]
20:50	<robh@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1183.eqiad.wmnet with reason: REIMAGE	[production]
20:48	<robh@cumin1001>	START - Cookbook sre.hosts.downtime for 2:00:00 on db1184.eqiad.wmnet with reason: REIMAGE	[production]
20:48	<robh@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1182.eqiad.wmnet with reason: REIMAGE	[production]
20:47	<robh@cumin1001>	START - Cookbook sre.hosts.downtime for 2:00:00 on db1183.eqiad.wmnet with reason: REIMAGE	[production]
20:46	<robh@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1181.eqiad.wmnet with reason: REIMAGE	[production]
20:45	<razzi>	release wikistats 2.9.0	[analytics]
20:45	<robh@cumin1001>	START - Cookbook sre.hosts.downtime for 2:00:00 on db1182.eqiad.wmnet with reason: REIMAGE	[production]
20:44	<robh@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1180.eqiad.wmnet with reason: REIMAGE	[production]
20:43	<robh@cumin1001>	START - Cookbook sre.hosts.downtime for 2:00:00 on db1181.eqiad.wmnet with reason: REIMAGE	[production]
20:42	<andrew@deploy1002>	Finished deploy [horizon/deploy@17ea780]: display volume usage summaries (duration: 03m 34s)	[production]
20:42	<robh@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1179.eqiad.wmnet with reason: REIMAGE	[production]
20:41	<robh@cumin1001>	START - Cookbook sre.hosts.downtime for 2:00:00 on db1180.eqiad.wmnet with reason: REIMAGE	[production]
20:40	<robh@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1178.eqiad.wmnet with reason: REIMAGE	[production]
20:39	<robh@cumin1001>	START - Cookbook sre.hosts.downtime for 2:00:00 on db1179.eqiad.wmnet with reason: REIMAGE	[production]
20:39	<andrew@deploy1002>	Started deploy [horizon/deploy@17ea780]: display volume usage summaries	[production]
20:38	<robh@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1177.eqiad.wmnet with reason: REIMAGE	[production]
20:37	<robh@cumin1001>	START - Cookbook sre.hosts.downtime for 2:00:00 on db1178.eqiad.wmnet with reason: REIMAGE	[production]
20:35	<robh@cumin1001>	START - Cookbook sre.hosts.downtime for 2:00:00 on db1177.eqiad.wmnet with reason: REIMAGE	[production]
20:30	<hashar>	Reloaded Zuul for I2368478e4c4ab8752581f55a7c5ab493fafdeb41	[releng]
20:26	<andrewbogott>	moving tools-elastic-3 to cloudvirt1034; two elastic nodes shouldn't be on the same hv	[tools]
20:19	<dzahn@cumin1001>	END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mw2238.codfw.wmnet	[production]
20:15	<ottomata>	install anaconda-wmf 2020.02~wmf3 on analytics cluster clients and workers - T262847	[analytics]
20:08	<dzahn@cumin1001>	START - Cookbook sre.hosts.decommission for hosts mw2238.codfw.wmnet	[production]