production SAL

2023-01-19
21:31	<jiji@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc2039.codfw.wmnet with reason: host reimage	[production]
21:31	<jdrewniak@deploy1002>	Started scap: Backport for [[gerrit:881677|Enable Page tools on viwiki and itwiki (T327348)]]	[production]
21:27	<jdrewniak@deploy1002>	Finished scap: Backport for [[gerrit:881612|Fix grid blowout with limited width turned off (T327423)]] (duration: 08m 26s)	[production]
21:27	<jiji@cumin1001>	START - Cookbook sre.hosts.downtime for 2:00:00 on mc2039.codfw.wmnet with reason: host reimage	[production]
21:20	<cwhite@deploy1002>	Finished deploy [releng/phatality@e0bb573]: (no justification provided) (duration: 00m 13s)	[production]
21:20	<cwhite@deploy1002>	Started deploy [releng/phatality@e0bb573]: (no justification provided)	[production]
21:20	<jdrewniak@deploy1002>	jdlrobson and jdrewniak: Backport for [[gerrit:881612|Fix grid blowout with limited width turned off (T327423)]] synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet	[production]
21:18	<jdrewniak@deploy1002>	Started scap: Backport for [[gerrit:881612|Fix grid blowout with limited width turned off (T327423)]]	[production]
21:11	<jiji@cumin1001>	START - Cookbook sre.hosts.reimage for host mc2039.codfw.wmnet with OS bullseye	[production]
20:13	<zabe@deploy1002>	Finished scap: fix k8s drift (duration: 08m 02s)	[production]
20:05	<zabe@deploy1002>	Started scap: fix k8s drift	[production]
20:02	<zabe@deploy1002>	Finished scap: Backport for [[gerrit:881706|Start reading from cuc_comment_id everywhere except wikidatawiki (T233004)]] (duration: 14m 01s)	[production]
19:49	<zabe@deploy1002>	zabe: Backport for [[gerrit:881706|Start reading from cuc_comment_id everywhere except wikidatawiki (T233004)]] synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet	[production]
19:48	<zabe@deploy1002>	Started scap: Backport for [[gerrit:881706|Start reading from cuc_comment_id everywhere except wikidatawiki (T233004)]]	[production]
18:36	<zabe>	re-start populateCucComment on wikidatawiki post-mwmaint-reboot in screen with --sleep 2, will take ~30 hours # T233004	[production]
18:17	<bd808@deploy1002>	helmfile [eqiad] DONE helmfile.d/services/developer-portal: apply	[production]
18:17	<bd808@deploy1002>	helmfile [eqiad] START helmfile.d/services/developer-portal: apply	[production]
18:16	<bd808@deploy1002>	helmfile [codfw] DONE helmfile.d/services/developer-portal: apply	[production]
18:16	<bd808@deploy1002>	helmfile [codfw] START helmfile.d/services/developer-portal: apply	[production]
18:13	<bd808@deploy1002>	helmfile [staging] DONE helmfile.d/services/developer-portal: apply	[production]
18:12	<bd808@deploy1002>	helmfile [staging] START helmfile.d/services/developer-portal: apply	[production]
18:08	<mbsantos@deploy1002>	helmfile [eqiad] DONE helmfile.d/services/mobileapps: apply	[production]
18:08	<mbsantos@deploy1002>	helmfile [eqiad] START helmfile.d/services/mobileapps: apply	[production]
18:06	<mbsantos@deploy1002>	helmfile [codfw] DONE helmfile.d/services/mobileapps: apply	[production]
18:05	<mbsantos@deploy1002>	helmfile [codfw] START helmfile.d/services/mobileapps: apply	[production]
18:02	<mbsantos@deploy1002>	helmfile [staging] DONE helmfile.d/services/mobileapps: apply	[production]
18:01	<mbsantos@deploy1002>	helmfile [staging] START helmfile.d/services/mobileapps: apply	[production]
17:36	<Amir1>	bash Krinkle> Vatican Interm Papacy Runbook, § 5.1: Notify Wikipedia about incoming traffic.	[production]
17:17	<jiji@cumin1001>	END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2038.codfw.wmnet with OS bullseye	[production]
17:13	<zabe@deploy1002>	Finished scap: T233004 (duration: 18m 50s)	[production]
17:02	<jiji@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc2038.codfw.wmnet with reason: host reimage	[production]
16:58	<jiji@cumin1001>	START - Cookbook sre.hosts.downtime for 2:00:00 on mc2038.codfw.wmnet with reason: host reimage	[production]
16:54	<zabe@deploy1002>	Started scap: T233004	[production]
16:54	<zabe@deploy1002>	backport aborted: (duration: 15m 22s)	[production]
16:48	<godog>	roll-restart opensearch-dashboards in logstash collectors eqiad - T327161	[production]
16:44	<zabe@deploy1002>	Started scap: Backport for [[gerrit:881609|Add ability to start from cuc_id to populateCucComment (T233004)]]	[production]
16:42	<jiji@cumin1001>	START - Cookbook sre.hosts.reimage for host mc2038.codfw.wmnet with OS bullseye	[production]
16:27	<moritzm>	installing cryptsetup updates for bullseye	[production]
16:18	<jmm@cumin2002>	END (FAIL) - Cookbook sre.o11y.roll-restart-reboot-logstash-collectors (exit_code=1) rolling restart_daemons on A:logstash-collector	[production]
16:13	<jclark@cumin1001>	END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=False) upgrade firmware for hosts ['druid1009']	[production]
16:11	<jclark@cumin1001>	START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['druid1009']	[production]
16:09	<jclark@cumin1001>	END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host druid1009.mgmt.eqiad.wmnet with reboot policy FORCED	[production]
16:08	<jmm@cumin2002>	START - Cookbook sre.o11y.roll-restart-reboot-logstash-collectors rolling restart_daemons on A:logstash-collector	[production]
16:06	<jclark@cumin1001>	START - Cookbook sre.hosts.provision for host druid1009.mgmt.eqiad.wmnet with reboot policy FORCED	[production]
15:55	<sukhe>	update pybal to 1.15.10 on lvs4010: T321191	[production]
15:45	<effie>	enable puppet on C:memcached hosts	[production]
15:42	<godog>	bounce opensearch on logstash102[34] - T327161	[production]
15:30	<sukhe>	reprepro -C main include buster-wikimedia pybal_1.15.10_amd64.changes: T321191	[production]
15:19	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'db2118 (re)pooling @ 100%: Maint done', diff saved to https://phabricator.wikimedia.org/P43194 and previous config saved to /var/cache/conftool/dbconfig/20230119-151917-ladsgroup.json	[production]
15:17	<effie>	disable puppet on all C:memcached servers to deploy 812173	[production]