production SAL

2651-2700 of 10000 results (47ms)

2022-03-25 §
08:24	<ladsgroup@cumin1001>	START - Cookbook sre.hosts.downtime for 6:00:00 on db1129.eqiad.wmnet with reason: Maintenance	[production]
08:04	<marostegui@cumin1001>	dbctl commit (dc=all): 'Depooling db1174 (T302658)', diff saved to https://phabricator.wikimedia.org/P23065 and previous config saved to /var/cache/conftool/dbconfig/20220325-080403-marostegui.json	[production]
08:04	<marostegui@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1174.eqiad.wmnet with reason: Maintenance	[production]
08:04	<marostegui@cumin1001>	START - Cookbook sre.hosts.downtime for 8:00:00 on db1174.eqiad.wmnet with reason: Maintenance	[production]
08:03	<marostegui@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1127 (T302658)', diff saved to https://phabricator.wikimedia.org/P23064 and previous config saved to /var/cache/conftool/dbconfig/20220325-080355-marostegui.json	[production]
08:02	<ladsgroup@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance	[production]
08:02	<ladsgroup@cumin1001>	START - Cookbook sre.hosts.downtime for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance	[production]
07:56	<ladsgroup@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 8 hosts with reason: Maintenance	[production]
07:56	<ladsgroup@cumin1001>	START - Cookbook sre.hosts.downtime for 12:00:00 on 8 hosts with reason: Maintenance	[production]
07:56	<ladsgroup@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2104.codfw.wmnet with reason: Maintenance	[production]
07:56	<ladsgroup@cumin1001>	START - Cookbook sre.hosts.downtime for 6:00:00 on db2104.codfw.wmnet with reason: Maintenance	[production]
07:56	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1146:3312 (T298565)', diff saved to https://phabricator.wikimedia.org/P23063 and previous config saved to /var/cache/conftool/dbconfig/20220325-075610-ladsgroup.json	[production]
07:48	<marostegui@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1127', diff saved to https://phabricator.wikimedia.org/P23062 and previous config saved to /var/cache/conftool/dbconfig/20220325-074850-marostegui.json	[production]
07:41	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1146:3312', diff saved to https://phabricator.wikimedia.org/P23061 and previous config saved to /var/cache/conftool/dbconfig/20220325-074105-ladsgroup.json	[production]
07:33	<marostegui@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1127', diff saved to https://phabricator.wikimedia.org/P23060 and previous config saved to /var/cache/conftool/dbconfig/20220325-073345-marostegui.json	[production]
07:26	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1146:3312', diff saved to https://phabricator.wikimedia.org/P23059 and previous config saved to /var/cache/conftool/dbconfig/20220325-072559-ladsgroup.json	[production]
07:18	<marostegui@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1127 (T302658)', diff saved to https://phabricator.wikimedia.org/P23058 and previous config saved to /var/cache/conftool/dbconfig/20220325-071840-marostegui.json	[production]
07:10	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1146:3312 (T298565)', diff saved to https://phabricator.wikimedia.org/P23057 and previous config saved to /var/cache/conftool/dbconfig/20220325-071054-ladsgroup.json	[production]
06:41	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Depooling db1146:3312 (T298565)', diff saved to https://phabricator.wikimedia.org/P23056 and previous config saved to /var/cache/conftool/dbconfig/20220325-064139-ladsgroup.json	[production]
06:41	<ladsgroup@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1146.eqiad.wmnet with reason: Maintenance	[production]
06:41	<ladsgroup@cumin1001>	START - Cookbook sre.hosts.downtime for 6:00:00 on db1146.eqiad.wmnet with reason: Maintenance	[production]
06:31	<_joe_>	deleting a couple zotero pods with excessive number of restarts	[production]
06:29	<marostegui>	dbmaint s4@eqiad T300775	[production]
06:07	<marostegui@cumin1001>	dbctl commit (dc=all): 'Depool db1112 for schema change', diff saved to https://phabricator.wikimedia.org/P23055 and previous config saved to /var/cache/conftool/dbconfig/20220325-060723-marostegui.json	[production]
05:47	<marostegui@cumin1001>	dbctl commit (dc=all): 'Depooling db1127 (T302658)', diff saved to https://phabricator.wikimedia.org/P23054 and previous config saved to /var/cache/conftool/dbconfig/20220325-054705-marostegui.json	[production]
05:47	<marostegui@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1127.eqiad.wmnet with reason: Maintenance	[production]
05:46	<marostegui@cumin1001>	START - Cookbook sre.hosts.downtime for 8:00:00 on db1127.eqiad.wmnet with reason: Maintenance	[production]
05:30	<marostegui@cumin1001>	dbctl commit (dc=all): 'Depool db1134 for testing', diff saved to https://phabricator.wikimedia.org/P23053 and previous config saved to /var/cache/conftool/dbconfig/20220325-053037-marostegui.json	[production]
00:39	<pt1979@cumin2002>	END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host restbase2027.codfw.wmnet with OS buster	[production]
2022-03-24 §
23:57	<pt1979@cumin2002>	START - Cookbook sre.hosts.reimage for host restbase2027.codfw.wmnet with OS buster	[production]
23:04	<pt1979@cumin2002>	END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host restbase2027.mgmt.codfw.wmnet with reboot policy FORCED	[production]
22:30	<marostegui@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 (T302658)', diff saved to https://phabricator.wikimedia.org/P23050 and previous config saved to /var/cache/conftool/dbconfig/20220324-223031-marostegui.json	[production]
22:19	<pt1979@cumin1001>	END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1047.eqiad.wmnet with OS bullseye	[production]
22:15	<marostegui@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1170:3317', diff saved to https://phabricator.wikimedia.org/P23049 and previous config saved to /var/cache/conftool/dbconfig/20220324-221526-marostegui.json	[production]
22:14	<pt1979@cumin2002>	START - Cookbook sre.hosts.provision for host restbase2027.mgmt.codfw.wmnet with reboot policy FORCED	[production]
22:10	<pt1979@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1047.eqiad.wmnet with reason: host reimage	[production]
22:07	<ebernhardson>	restart wcqs-blazegraph on wcqs2001 to resolve intermittant BlazegraphFreeAllocatorsDecreasingRapidly	[production]
22:06	<pt1979@cumin1001>	START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1047.eqiad.wmnet with reason: host reimage	[production]
22:00	<marostegui@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1170:3317', diff saved to https://phabricator.wikimedia.org/P23048 and previous config saved to /var/cache/conftool/dbconfig/20220324-220021-marostegui.json	[production]
21:54	<pt1979@cumin1001>	START - Cookbook sre.hosts.reimage for host cloudvirt1047.eqiad.wmnet with OS bullseye	[production]
21:45	<marostegui@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 (T302658)', diff saved to https://phabricator.wikimedia.org/P23047 and previous config saved to /var/cache/conftool/dbconfig/20220324-214515-marostegui.json	[production]
21:42	<pt1979@cumin1001>	END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudvirt1047.eqiad.wmnet with OS bullseye	[production]
21:38	<pt1979@cumin2002>	END (PASS) - Cookbook sre.dns.netbox (exit_code=0)	[production]
21:33	<pt1979@cumin2002>	START - Cookbook sre.dns.netbox	[production]
21:13	<pt1979@cumin1001>	START - Cookbook sre.hosts.reimage for host cloudvirt1047.eqiad.wmnet with OS bullseye	[production]
21:13	<mwdebug-deploy@deploy1002>	helmfile [codfw] DONE helmfile.d/services/mwdebug: apply	[production]
21:12	<mwdebug-deploy@deploy1002>	helmfile [codfw] START helmfile.d/services/mwdebug: apply	[production]
21:12	<mwdebug-deploy@deploy1002>	helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply	[production]
21:11	<mwdebug-deploy@deploy1002>	helmfile [eqiad] START helmfile.d/services/mwdebug: apply	[production]
21:11	<inflatador>	bking@cumin1001 restarting blazegraph on wdqs[1003-1013].eqiad.wmnet for T293862	[production]