__all__ SAL

2024-02-07
18:30	<btullis@cumin1002>	START - Cookbook sre.kafka.roll-restart-mirror-maker restart MirrorMaker for Kafka A:kafka-mirror-maker-jumbo-eqiad cluster: Roll restart of jvm daemons.	[production]
18:25	<btullis@cumin1002>	END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling restart_daemons on A:kafka-jumbo-eqiad	[production]
18:23	<marostegui@cumin1002>	dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P56463 and previous config saved to /var/cache/conftool/dbconfig/20240207-182342-marostegui.json	[production]
18:08	<marostegui@cumin1002>	dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P56462 and previous config saved to /var/cache/conftool/dbconfig/20240207-180835-marostegui.json	[production]
18:00	<andrew@cloudcumin1001>	START - Cookbook wmcs.toolforge.k8s.reboot for all workers	[tools]
17:59	<wmbot~lucaswerkmeister@tools-sgebastion-10>	started webservice again (and patched the startup probe into it); took a while to come up but now it seems to be working	[tools.lexeme-forms]
17:58	<andrew@cloudcumin1001>	END (PASS) - Cookbook wmcs.toolforge.k8s.reboot (exit_code=0) for tools-k8s-worker-nfs-9	[tools]
17:58	<andrew@cloudcumin1001>	START - Cookbook wmcs.toolforge.k8s.reboot for tools-k8s-worker-nfs-9	[tools]
17:53	<marostegui@cumin1002>	dbctl commit (dc=all): 'Repooling after maintenance db1191 (T355609)', diff saved to https://phabricator.wikimedia.org/P56461 and previous config saved to /var/cache/conftool/dbconfig/20240207-175328-marostegui.json	[production]
17:52	<bking@cumin2002>	END (PASS) - Cookbook sre.elasticsearch.ban (exit_code=0) Unbanning all hosts in search_codfw	[production]
17:52	<bking@cumin2002>	START - Cookbook sre.elasticsearch.ban Unbanning all hosts in search_codfw	[production]
17:49	<wmbot~lucaswerkmeister@tools-sgebastion-10>	stopped webservice, restart wasn’t working so let’s try harder	[tools.lexeme-forms]
17:48	<marostegui@cumin1002>	dbctl commit (dc=all): 'Depooling db1191 (T355609)', diff saved to https://phabricator.wikimedia.org/P56460 and previous config saved to /var/cache/conftool/dbconfig/20240207-174807-marostegui.json	[production]
17:48	<marostegui@cumin1002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1191.eqiad.wmnet with reason: Maintenance	[production]
17:47	<marostegui@cumin1002>	START - Cookbook sre.hosts.downtime for 6:00:00 on db1191.eqiad.wmnet with reason: Maintenance	[production]
17:47	<marostegui@cumin1002>	dbctl commit (dc=all): 'Repooling after maintenance db1174 (T355609)', diff saved to https://phabricator.wikimedia.org/P56459 and previous config saved to /var/cache/conftool/dbconfig/20240207-174745-marostegui.json	[production]
17:45	<wmbot~lucaswerkmeister@tools-sgebastion-10>	restarted webservice, log was full of various errors	[tools.lexeme-forms]
17:32	<jgiannelos@deploy2002>	Finished deploy [restbase/deploy@1007273]: Disabling storage for jawiki (duration: 07m 19s)	[production]
17:32	<marostegui@cumin1002>	dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P56458 and previous config saved to /var/cache/conftool/dbconfig/20240207-173238-marostegui.json	[production]
17:26	<btullis>	roll-restarting kafka-jumbo for T356382	[analytics]
17:26	<btullis@cumin1002>	START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling restart_daemons on A:kafka-jumbo-eqiad	[production]
17:25	<jgiannelos@deploy2002>	Started deploy [restbase/deploy@1007273]: Disabling storage for jawiki	[production]
17:24	<taavi@cloudcumin1001>	END (FAIL) - Cookbook wmcs.toolforge.k8s.reboot (exit_code=99) for all workers	[tools]
17:23	<taavi@cloudcumin1001>	START - Cookbook wmcs.toolforge.k8s.reboot for all workers	[tools]
17:17	<marostegui@cumin1002>	dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P56457 and previous config saved to /var/cache/conftool/dbconfig/20240207-171732-marostegui.json	[production]
17:11	<hnowlan@puppetmaster1001>	conftool action : set/pooled=yes:weight=10; selector: service=thumbor	[production]
17:05	<taavi@cloudcumin1001>	END (FAIL) - Cookbook wmcs.toolforge.k8s.reboot (exit_code=99) for all workers	[tools]
17:05	<taavi@cloudcumin1001>	START - Cookbook wmcs.toolforge.k8s.reboot for all workers	[tools]
17:04	<sbailey@deploy2002>	helmfile [codfw] DONE helmfile.d/services/wikifeeds: apply	[production]
17:04	<sbailey@deploy2002>	helmfile [codfw] START helmfile.d/services/wikifeeds: apply	[production]
17:03	<sbailey@deploy2002>	helmfile [eqiad] DONE helmfile.d/services/wikifeeds: apply	[production]
17:03	<taavi@cloudcumin1001>	END (ERROR) - Cookbook wmcs.toolforge.k8s.reboot (exit_code=97) for all workers	[tools]
17:03	<sbailey@deploy2002>	helmfile [eqiad] START helmfile.d/services/wikifeeds: apply	[production]
17:02	<taavi@cloudcumin1001>	START - Cookbook wmcs.toolforge.k8s.reboot for all workers	[tools]
17:02	<marostegui@cumin1002>	dbctl commit (dc=all): 'Repooling after maintenance db1174 (T355609)', diff saved to https://phabricator.wikimedia.org/P56456 and previous config saved to /var/cache/conftool/dbconfig/20240207-170225-marostegui.json	[production]
17:01	<taavi@cloudcumin1001>	END (FAIL) - Cookbook wmcs.toolforge.k8s.reboot (exit_code=99) for all workers	[tools]
16:57	<marostegui@cumin1002>	dbctl commit (dc=all): 'Depooling db1174 (T355609)', diff saved to https://phabricator.wikimedia.org/P56455 and previous config saved to /var/cache/conftool/dbconfig/20240207-165703-marostegui.json	[production]
16:56	<marostegui@cumin1002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1174.eqiad.wmnet with reason: Maintenance	[production]
16:56	<marostegui@cumin1002>	START - Cookbook sre.hosts.downtime for 6:00:00 on db1174.eqiad.wmnet with reason: Maintenance	[production]
16:55	<sbailey@deploy2002>	helmfile [staging] DONE helmfile.d/services/wikifeeds: apply	[production]
16:54	<sbailey@deploy2002>	helmfile [staging] START helmfile.d/services/wikifeeds: apply	[production]
16:52	<hnowlan@cumin2002>	conftool action : set/pooled=yes; selector: name=(mw2377.codfw.wmnet|mw2378.codfw.wmnet|mw2406.codfw.wmnet|mw2301.codfw.wmnet|mw2310.codfw.wmnet),cluster=kubernetes,service=kubesvc	[production]
16:52	<hnowlan@cumin2002>	conftool action : set/weight=10; selector: name=(mw2377.codfw.wmnet|mw2378.codfw.wmnet|mw2406.codfw.wmnet|mw2301.codfw.wmnet|mw2310.codfw.wmnet),cluster=kubernetes,service=kubesvc	[production]
16:47	<marostegui@cumin1002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1171.eqiad.wmnet with reason: Maintenance	[production]
16:47	<marostegui@cumin1002>	START - Cookbook sre.hosts.downtime for 6:00:00 on db1171.eqiad.wmnet with reason: Maintenance	[production]
16:47	<marostegui@cumin1002>	dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 (T355609)', diff saved to https://phabricator.wikimedia.org/P56454 and previous config saved to /var/cache/conftool/dbconfig/20240207-164738-marostegui.json	[production]
16:47	<cmooney@cumin1002>	END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for asw-a-codfw,cr[1-2]-codfw,lsw1-a2-codfw.mgmt	[production]
16:47	<cmooney@cumin1002>	START - Cookbook sre.hosts.remove-downtime for asw-a-codfw,cr[1-2]-codfw,lsw1-a2-codfw.mgmt	[production]
16:47	<ejegg>	fundraising civicrm upgraded from c3dff157 to 98d35c79	[production]
16:46	<hnowlan>	homer 'crcodfw' commit 'T354791' for 5 new k8s ex-appservers	[production]