production SAL

2023-08-31
12:09	<cmooney@cumin1001>	END (FAIL) - Cookbook sre.network.provision (exit_code=99) for device lsw1-a1-codfw.mgmt.codfw.wmnet	[production]
12:09	<cmooney@cumin1001>	START - Cookbook sre.network.provision for device lsw1-a1-codfw.mgmt.codfw.wmnet	[production]
12:05	<ladsgroup@deploy1002>	isaranto and ladsgroup: Backport for [[gerrit:953973|ores-extension: enable lift wing for fiwiki and itwiki (T343308)]] synced to the testservers mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet, and mw-debug kubernetes deployment (accessible via k8s-experimental XWD option)	[production]
12:03	<ladsgroup@deploy1002>	Started scap: Backport for [[gerrit:953973|ores-extension: enable lift wing for fiwiki and itwiki (T343308)]]	[production]
12:03	<aqu@deploy1002>	Started deploy [analytics/refinery@06203c0]: Regular analytics weekly train [analytics/refinery@06203c0]	[production]
12:02	<aqu>	About to deploy analytics refinery (weekly train)	[production]
12:01	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db2151', diff saved to https://phabricator.wikimedia.org/P52194 and previous config saved to /var/cache/conftool/dbconfig/20230831-120148-ladsgroup.json	[production]
11:59	<jayme@cumin1001>	END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host kubemaster1002.eqiad.wmnet	[production]
11:54	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1201', diff saved to https://phabricator.wikimedia.org/P52193 and previous config saved to /var/cache/conftool/dbconfig/20230831-115429-ladsgroup.json	[production]
11:48	<cmooney@cumin1001>	END (PASS) - Cookbook sre.dns.netbox (exit_code=0)	[production]
11:47	<jayme@cumin1001>	START - Cookbook sre.hosts.reboot-single for host kubemaster1002.eqiad.wmnet	[production]
11:46	<marostegui@cumin1001>	START - Cookbook sre.mysql.clone of db1132.eqiad.wmnet onto db1119.eqiad.wmnet	[production]
11:46	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db2151', diff saved to https://phabricator.wikimedia.org/P52192 and previous config saved to /var/cache/conftool/dbconfig/20230831-114642-ladsgroup.json	[production]
11:46	<cmooney@cumin1001>	START - Cookbook sre.dns.netbox	[production]
11:46	<cmooney@cumin1001>	START - Cookbook sre.network.provision for device ssw1-a8-codfw.mgmt.codfw.wmnet	[production]
11:44	<jayme@cumin1001>	END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host kubemaster1001.eqiad.wmnet	[production]
11:40	<cgoubert@deploy1002>	helmfile [eqiad] DONE helmfile.d/services/mw-api-ext: apply	[production]
11:40	<cgoubert@deploy1002>	helmfile [eqiad] START helmfile.d/services/mw-api-ext: apply	[production]
11:39	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1201 (T343718)', diff saved to https://phabricator.wikimedia.org/P52191 and previous config saved to /var/cache/conftool/dbconfig/20230831-113922-ladsgroup.json	[production]
11:36	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Depooling db1201 (T343718)', diff saved to https://phabricator.wikimedia.org/P52190 and previous config saved to /var/cache/conftool/dbconfig/20230831-113613-ladsgroup.json	[production]
11:36	<ladsgroup@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1201.eqiad.wmnet with reason: Maintenance	[production]
11:36	<ladsgroup@cumin1001>	START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1201.eqiad.wmnet with reason: Maintenance	[production]
11:36	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1187 (T343718)', diff saved to https://phabricator.wikimedia.org/P52189 and previous config saved to /var/cache/conftool/dbconfig/20230831-113603-ladsgroup.json	[production]
11:33	<marostegui@cumin1001>	dbctl commit (dc=all): 'Depool db1132', diff saved to https://phabricator.wikimedia.org/P52187 and previous config saved to /var/cache/conftool/dbconfig/20230831-113324-root.json	[production]
11:32	<jayme@cumin1001>	START - Cookbook sre.hosts.reboot-single for host kubemaster1001.eqiad.wmnet	[production]
11:31	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db2151 (T343718)', diff saved to https://phabricator.wikimedia.org/P52186 and previous config saved to /var/cache/conftool/dbconfig/20230831-113136-ladsgroup.json	[production]
11:31	<jayme@cumin1001>	END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:wikikube-worker-eqiad	[production]
11:20	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1187', diff saved to https://phabricator.wikimedia.org/P52185 and previous config saved to /var/cache/conftool/dbconfig/20230831-112057-ladsgroup.json	[production]
11:13	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Depooling db2151 (T343718)', diff saved to https://phabricator.wikimedia.org/P52184 and previous config saved to /var/cache/conftool/dbconfig/20230831-111353-ladsgroup.json	[production]
11:13	<ladsgroup@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2151.codfw.wmnet with reason: Maintenance	[production]
11:13	<ladsgroup@cumin1001>	START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2151.codfw.wmnet with reason: Maintenance	[production]
11:13	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db2129 (T343718)', diff saved to https://phabricator.wikimedia.org/P52183 and previous config saved to /var/cache/conftool/dbconfig/20230831-111332-ladsgroup.json	[production]
11:08	<cmooney@cumin1001>	END (PASS) - Cookbook sre.network.provision (exit_code=0) for device ssw1-a1-codfw.mgmt.codfw.wmnet	[production]
11:07	<jayme@cumin1001>	END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for kubernetes1025.eqiad.wmnet	[production]
11:06	<jayme@cumin1001>	START - Cookbook sre.hosts.remove-downtime for kubernetes1025.eqiad.wmnet	[production]
11:05	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1187', diff saved to https://phabricator.wikimedia.org/P52182 and previous config saved to /var/cache/conftool/dbconfig/20230831-110551-ladsgroup.json	[production]
10:58	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db2129', diff saved to https://phabricator.wikimedia.org/P52181 and previous config saved to /var/cache/conftool/dbconfig/20230831-105826-ladsgroup.json	[production]
10:54	<ariel@cumin1001>	END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dumpsdata1006.eqiad.wmnet	[production]
10:54	<cgoubert@cumin1001>	START - Cookbook sre.hosts.reboot-cluster	[production]
10:54	<cgoubert@cumin1001>	END (PASS) - Cookbook sre.hosts.reboot-cluster (exit_code=0)	[production]
10:53	<cmooney@cumin1001>	END (PASS) - Cookbook sre.dns.netbox (exit_code=0)	[production]
10:53	<cmooney@cumin1001>	END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add management record for ssw1-a1-codfw - cmooney@cumin1001"	[production]
10:52	<cmooney@cumin1001>	START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add management record for ssw1-a1-codfw - cmooney@cumin1001"	[production]
10:51	<hnowlan@deploy1002>	helmfile [eqiad] DONE helmfile.d/services/device-analytics: apply	[production]
10:50	<cmooney@cumin1001>	START - Cookbook sre.dns.netbox	[production]
10:50	<cmooney@cumin1001>	START - Cookbook sre.network.provision for device ssw1-a1-codfw.mgmt.codfw.wmnet	[production]
10:50	<hnowlan@deploy1002>	helmfile [eqiad] START helmfile.d/services/device-analytics: apply	[production]
10:50	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1187 (T343718)', diff saved to https://phabricator.wikimedia.org/P52180 and previous config saved to /var/cache/conftool/dbconfig/20230831-105044-ladsgroup.json	[production]
10:50	<hnowlan@deploy1002>	helmfile [codfw] DONE helmfile.d/services/device-analytics: apply	[production]
10:50	<moritzm>	installing flask security updates on buster	[production]