__all__ SAL

2023-06-09
21:50	<jclark@cumin1001>	END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host backup1010.eqiad.wmnet with OS bullseye	[production]
20:53	<jclark@cumin1001>	START - Cookbook sre.hosts.reimage for host backup1011.eqiad.wmnet with OS bullseye	[production]
20:53	<jclark@cumin1001>	START - Cookbook sre.hosts.reimage for host backup1010.eqiad.wmnet with OS bullseye	[production]
20:40	<btullis>	restarting the aqs service more quickly with: `sudo cumin -b 2 -s 10 A:aqs 'systemctl restart aqs'`	[analytics]
20:38	<btullis@cumin1001>	END (ERROR) - Cookbook sre.aqs.roll-restart-reboot (exit_code=97) rolling restart_daemons on A:aqs	[production]
20:23	<btullis>	btullis@cumin1001:~$ sudo cookbook sre.aqs.roll-restart-reboot --alias aqs restart_daemons --reason aqs_rollback_btullis	[analytics]
20:23	<btullis@cumin1001>	START - Cookbook sre.aqs.roll-restart-reboot rolling restart_daemons on A:aqs	[production]
20:22	<btullis>	merged and deployed https://gerrit.wikimedia.org/r/c/operations/puppet/+/928927 to revert aqs mediawiki snapshot change	[analytics]
19:57	<andrewbogott>	rebooting tools-sgeweblight-10-18 to see if it helps with T338644	[tools]
19:38	<andrewbogott>	rebooting tools-sgeweblight-10-28 for T337806	[tools]
17:51	<pt1979@cumin2002>	START - Cookbook sre.hosts.reimage for host snapshot1016.eqiad.wmnet with OS bullseye	[production]
17:47	<pt1979@cumin2002>	END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host snapshot1016.eqiad.wmnet with OS buster	[production]
17:34	<jhancock@cumin2002>	END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host sretest1003.mgmt.eqiad.wmnet with reboot policy FORCED	[production]
17:34	<jhancock@cumin2002>	START - Cookbook sre.hosts.provision for host sretest1003.mgmt.eqiad.wmnet with reboot policy FORCED	[production]
17:32	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db2176 (T336886)', diff saved to https://phabricator.wikimedia.org/P49398 and previous config saved to /var/cache/conftool/dbconfig/20230609-173202-ladsgroup.json	[production]
17:16	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P49397 and previous config saved to /var/cache/conftool/dbconfig/20230609-171656-ladsgroup.json	[production]
17:01	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P49396 and previous config saved to /var/cache/conftool/dbconfig/20230609-170150-ladsgroup.json	[production]
16:54	<pt1979@cumin2002>	START - Cookbook sre.hosts.reimage for host snapshot1016.eqiad.wmnet with OS buster	[production]
16:47	<wm-bot>	<lucaswerkmeister> deployed e059c8bbd6 (l10n updates: fi); also, last time I forgot to git rebase, so this actually includes 2035050d28 (l10n updates: sv) as well	[tools.lexeme-forms]
16:46	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db2176 (T336886)', diff saved to https://phabricator.wikimedia.org/P49395 and previous config saved to /var/cache/conftool/dbconfig/20230609-164644-ladsgroup.json	[production]
16:30	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Depooling db2176 (T336886)', diff saved to https://phabricator.wikimedia.org/P49394 and previous config saved to /var/cache/conftool/dbconfig/20230609-163007-ladsgroup.json	[production]
16:30	<ladsgroup@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2176.codfw.wmnet with reason: Maintenance	[production]
16:29	<ladsgroup@cumin1001>	START - Cookbook sre.hosts.downtime for 12:00:00 on db2176.codfw.wmnet with reason: Maintenance	[production]
16:29	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db2174 (T336886)', diff saved to https://phabricator.wikimedia.org/P49393 and previous config saved to /var/cache/conftool/dbconfig/20230609-162946-ladsgroup.json	[production]
16:20	<urandom>	powercycling restbase1028	[production]
16:14	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db2174', diff saved to https://phabricator.wikimedia.org/P49392 and previous config saved to /var/cache/conftool/dbconfig/20230609-161440-ladsgroup.json	[production]
16:05	<pt1979@cumin2002>	END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host snapshot1017.mgmt.eqiad.wmnet with reboot policy FORCED	[production]
16:04	<pt1979@cumin2002>	END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['snapshot1016']	[production]
16:02	<pt1979@cumin2002>	START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['snapshot1016']	[production]
15:59	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db2174', diff saved to https://phabricator.wikimedia.org/P49391 and previous config saved to /var/cache/conftool/dbconfig/20230609-155934-ladsgroup.json	[production]
15:57	<pt1979@cumin2002>	END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host snapshot1016.mgmt.eqiad.wmnet with reboot policy FORCED	[production]
15:44	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db2174 (T336886)', diff saved to https://phabricator.wikimedia.org/P49390 and previous config saved to /var/cache/conftool/dbconfig/20230609-154428-ladsgroup.json	[production]
15:40	<wm-bot>	<lucaswerkmeister> (restarts were by lucaswerkmeister and TheresNoTime ftr ^^)	[tools.sal]
15:39	<wm-bot>	<root> webservice restart	[tools.sal]
15:39	<wm-bot>	<root> restart to fix 5xx errors	[tools.sal]
15:30	<andrewbogott>	wikitech-static: deleted everything in /srv/mediawiki/images/wikitech/archive for T338520	[production]
15:28	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Depooling db2174 (T336886)', diff saved to https://phabricator.wikimedia.org/P49388 and previous config saved to /var/cache/conftool/dbconfig/20230609-152845-ladsgroup.json	[production]
15:28	<ladsgroup@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2174.codfw.wmnet with reason: Maintenance	[production]
15:28	<ladsgroup@cumin1001>	START - Cookbook sre.hosts.downtime for 12:00:00 on db2174.codfw.wmnet with reason: Maintenance	[production]
15:28	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db2173 (T336886)', diff saved to https://phabricator.wikimedia.org/P49387 and previous config saved to /var/cache/conftool/dbconfig/20230609-152824-ladsgroup.json	[production]
15:27	<pt1979@cumin2002>	START - Cookbook sre.hosts.provision for host snapshot1017.mgmt.eqiad.wmnet with reboot policy FORCED	[production]
15:27	<pt1979@cumin2002>	START - Cookbook sre.hosts.provision for host snapshot1016.mgmt.eqiad.wmnet with reboot policy FORCED	[production]
15:23	<pt1979@cumin2002>	END (PASS) - Cookbook sre.dns.netbox (exit_code=0)	[production]
15:23	<pt1979@cumin2002>	END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add DNS entry for snapshot101[6-7] - pt1979@cumin2002"	[production]
15:22	<pt1979@cumin2002>	START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add DNS entry for snapshot101[6-7] - pt1979@cumin2002"	[production]
15:21	<TheresNoTime>	deployment-prep: `[samtar@deployment-deploy03 ~]$ sudo -u jenkins-deploy scap prep auto --no-log-message`	[releng]
15:17	<pt1979@cumin2002>	START - Cookbook sre.dns.netbox	[production]
15:13	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db2173', diff saved to https://phabricator.wikimedia.org/P49386 and previous config saved to /var/cache/conftool/dbconfig/20230609-151318-ladsgroup.json	[production]
15:11	<TheresNoTime>	deployment-prep: `[samtar@deployment-deploy03 ~]$ scap sync-world --no-log-message`	[releng]
14:58	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db2173', diff saved to https://phabricator.wikimedia.org/P49385 and previous config saved to /var/cache/conftool/dbconfig/20230609-145812-ladsgroup.json	[production]