production SAL

2023-05-23
18:26	<ryankemper>	[WDQS] T337327 Deployed new, hopefully-working rule after addressing previous syntax error (unescaped `"`). See `/srv/private` commit `6e2f5ab19427902994bb9d03d28277252f021474`	[production]
18:16	<ryankemper>	[WDQS] Rolled back requestctl rule	[production]
18:12	<ryankemper>	[WDQS] T337327 New rule in place to ban potential source of WDQS codfw outage. Rolling restart will be done in a couple minutes to [attempt to] restore service availability	[production]
17:05	<otto@deploy1002>	helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mediawiki-page-content-change-enrichment: apply	[production]
17:05	<otto@deploy1002>	helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mediawiki-page-content-change-enrichment: apply	[production]
17:03	<sbassett>	Deployed updated security mitigation for T336027 and T333140	[production]
17:00	<akosiaris@cumin1001>	END (PASS) - Cookbook sre.kafka.reboot-workers (exit_code=0) for Kafka main-eqiad cluster: Reboot kafka nodes	[production]
16:58	<otto@deploy1002>	helmfile [staging] DONE helmfile.d/services/mw-page-content-change-enrich: apply	[production]
16:58	<otto@deploy1002>	helmfile [staging] START helmfile.d/services/mw-page-content-change-enrich: apply	[production]
16:50	<sbassett>	Deployed updated security mitigation for T336027, part 2	[production]
16:50	<otto@deploy1002>	helmfile [staging] DONE helmfile.d/services/mw-page-content-change-enrich: apply	[production]
16:49	<otto@deploy1002>	helmfile [staging] START helmfile.d/services/mw-page-content-change-enrich: apply	[production]
16:43	<cmooney@cumin1001>	END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) homer to cumin2002.codfw.wmnet,cumin1001.eqiad.wmnet with reason: Homer Release v0.6.2 with updated wmf-plugin - cmooney@cumin1001	[production]
16:43	<otto@deploy1002>	helmfile [staging] DONE helmfile.d/services/mw-page-content-change-enrich: apply	[production]
16:43	<otto@deploy1002>	helmfile [staging] START helmfile.d/services/mw-page-content-change-enrich: apply	[production]
16:42	<sbassett>	Deployed updated security mitigation for T336027	[production]
16:41	<cmooney@cumin1001>	START - Cookbook sre.deploy.python-code homer to cumin2002.codfw.wmnet,cumin1001.eqiad.wmnet with reason: Homer Release v0.6.2 with updated wmf-plugin - cmooney@cumin1001	[production]
16:31	<otto@deploy1002>	Synchronized wmf-config/ext-EventStreamConfig.php: EventStreamConfig - Rename page content change enrich error stream to match convention - T336656 (duration: 06m 58s)	[production]
16:22	<sukhe@deploy1002>	Unlocked for deployment [ALL REPOSITORIES]: LVS maintenance in eqiad, blocking deploys T322937 (duration: 36m 02s)	[production]
15:56	<topranks>	moving lvs1018 connection to rack E1 from lsw1-e1-eqiad to ssw1-e1-eqiad T322937	[production]
15:46	<sukhe@deploy1002>	Locking from deployment [ALL REPOSITORIES]: LVS maintenance in eqiad, blocking deploys T322937	[production]
15:46	<otto@deploy1002>	helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mediawiki-page-content-change-enrichment: apply	[production]
15:45	<otto@deploy1002>	helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mediawiki-page-content-change-enrichment: apply	[production]
15:45	<sukhe>	stop pybal on lvs1018: T322937	[production]
15:38	<eoghan@cumin1001>	END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host releases2003.codfw.wmnet with OS bullseye	[production]
15:30	<jclark@cumin1001>	END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host an-worker1150.mgmt.eqiad.wmnet with reboot policy FORCED	[production]
15:24	<eoghan@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on releases2003.codfw.wmnet with reason: host reimage	[production]
15:22	<jayme@deploy1002>	helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.	[production]
15:22	<jayme@deploy1002>	helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.	[production]
15:22	<jayme@deploy1002>	helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.	[production]
15:21	<jayme@deploy1002>	helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.	[production]
15:21	<jayme@deploy1002>	helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.	[production]
15:21	<jclark@cumin1001>	START - Cookbook sre.hosts.provision for host an-worker1150.mgmt.eqiad.wmnet with reboot policy FORCED	[production]
15:21	<jayme@deploy1002>	helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.	[production]
15:21	<jayme@deploy1002>	helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.	[production]
15:21	<jayme@deploy1002>	helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.	[production]
15:20	<jclark@cumin1001>	END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host an-worker1150.mgmt.eqiad.wmnet with reboot policy FORCED	[production]
15:20	<eoghan@cumin1001>	START - Cookbook sre.hosts.downtime for 2:00:00 on releases2003.codfw.wmnet with reason: host reimage	[production]
15:20	<jayme@deploy1002>	helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.	[production]
15:19	<jayme@deploy1002>	helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.	[production]
15:16	<jclark@cumin1001>	START - Cookbook sre.hosts.provision for host an-worker1150.mgmt.eqiad.wmnet with reboot policy FORCED	[production]
15:16	<jclark@cumin1001>	END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host an-worker1150.mgmt.eqiad.wmnet with reboot policy FORCED	[production]
15:14	<otto@deploy1002>	helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mediawiki-page-content-change-enrichment: apply	[production]
15:14	<otto@deploy1002>	helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mediawiki-page-content-change-enrichment: apply	[production]
15:03	<eoghan@cumin1001>	START - Cookbook sre.hosts.reimage for host releases2003.codfw.wmnet with OS bullseye	[production]
15:02	<eoghan@cumin1001>	END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host releases1003.eqiad.wmnet with OS bullseye	[production]
15:00	<jclark@cumin1001>	START - Cookbook sre.hosts.provision for host an-worker1150.mgmt.eqiad.wmnet with reboot policy FORCED	[production]
15:00	<akosiaris@cumin1001>	START - Cookbook sre.kafka.reboot-workers for Kafka main-eqiad cluster: Reboot kafka nodes	[production]
14:58	<otto@deploy1002>	helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.	[production]
14:58	<otto@deploy1002>	helmfile [staging-eqiad] START helmfile.d/admin 'apply'.	[production]