production SAL

2024-07-12
15:25	<swfrench@deploy1002>	helmfile [codfw] START helmfile.d/services/commons-impact-analytics: apply	[production]
15:23	<cgoubert@cumin1002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1349.eqiad.wmnet with reason: host reimage	[production]
15:21	<cgoubert@cumin1002>	START - Cookbook sre.hosts.downtime for 2:00:00 on mw1351.eqiad.wmnet with reason: host reimage	[production]
15:21	<cgoubert@cumin1002>	START - Cookbook sre.hosts.downtime for 2:00:00 on mw1350.eqiad.wmnet with reason: host reimage	[production]
15:21	<cgoubert@cumin1002>	START - Cookbook sre.hosts.downtime for 2:00:00 on mw1349.eqiad.wmnet with reason: host reimage	[production]
15:20	<swfrench@deploy1002>	helmfile [staging] DONE helmfile.d/services/commons-impact-analytics: apply	[production]
15:20	<swfrench@deploy1002>	helmfile [staging] START helmfile.d/services/commons-impact-analytics: apply	[production]
15:19	<marostegui@cumin1002>	dbctl commit (dc=all): 'Repooling after maintenance db2218', diff saved to https://phabricator.wikimedia.org/P66405 and previous config saved to /var/cache/conftool/dbconfig/20240712-151907-marostegui.json	[production]
15:17	<hnowlan@deploy1002>	helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: apply	[production]
15:17	<hnowlan@deploy1002>	helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: apply	[production]
15:17	<hnowlan@deploy1002>	helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: apply	[production]
15:17	<hnowlan@deploy1002>	helmfile [eqiad] START helmfile.d/services/changeprop-jobqueue: apply	[production]
15:15	<hnowlan>	homer 'creqiad' commit 'videoscaler reimages mw1349/mw135[01]'	[production]
15:08	<dcausse@deploy1002>	helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply	[production]
15:07	<dcausse@deploy1002>	helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply	[production]
15:07	<cgoubert@cumin1002>	START - Cookbook sre.hosts.reimage for host mw1351.eqiad.wmnet with OS buster	[production]
15:06	<cgoubert@cumin1002>	START - Cookbook sre.hosts.reimage for host mw1350.eqiad.wmnet with OS buster	[production]
15:06	<cgoubert@cumin1002>	START - Cookbook sre.hosts.reimage for host mw1349.eqiad.wmnet with OS buster	[production]
15:04	<dcausse@deploy1002>	helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply	[production]
15:04	<dcausse@deploy1002>	helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply	[production]
15:04	<marostegui@cumin1002>	dbctl commit (dc=all): 'Repooling after maintenance db2218 (T367856)', diff saved to https://phabricator.wikimedia.org/P66404 and previous config saved to /var/cache/conftool/dbconfig/20240712-150400-marostegui.json	[production]
15:03	<dcausse@deploy1002>	helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply	[production]
15:02	<dcausse@deploy1002>	helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply	[production]
14:58	<cgoubert@cumin1002>	conftool action : set/pooled=inactive; selector: name=(mw1349.eqiad.wmnet\|mw1350.eqiad.wmnet\|mw1351.eqiad.wmnet),cluster=kubernetes,service=kubesvc	[production]
14:55	<claime>	Draining and depooling mw1349, mw1350, mw1351 for reimage as jobrunners	[production]
14:36	<elukey@cumin1002>	END (FAIL) - Cookbook sre.network.tls (exit_code=99) for network device lsw1-d3-codfw	[production]
14:34	<elukey@cumin1002>	START - Cookbook sre.network.tls for network device lsw1-d3-codfw	[production]
14:20	<hnowlan@deploy1002>	helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: apply	[production]
14:19	<hnowlan@deploy1002>	helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: apply	[production]
14:19	<hnowlan@deploy1002>	helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: apply	[production]
14:18	<hnowlan@deploy1002>	helmfile [eqiad] START helmfile.d/services/changeprop-jobqueue: apply	[production]
13:45	<pt1979@cumin2002>	END (PASS) - Cookbook sre.dns.netbox (exit_code=0)	[production]
13:43	<pt1979@cumin2002>	START - Cookbook sre.dns.netbox	[production]
13:22	<cdanis@deploy1002>	helmfile [aux-k8s-eqiad] DONE helmfile.d/aus-k8s-eqiad-services/jaeger: apply	[production]
13:21	<cdanis@deploy1002>	helmfile [aux-k8s-eqiad] START helmfile.d/aus-k8s-eqiad-services/jaeger: apply	[production]
13:21	<cdanis@deploy1002>	helmfile [aux-k8s-eqiad] DONE helmfile.d/aus-k8s-eqiad-services/jaeger: apply	[production]
13:21	<cdanis@deploy1002>	helmfile [aux-k8s-eqiad] START helmfile.d/aus-k8s-eqiad-services/jaeger: apply	[production]
13:19	<cdanis@deploy1002>	helmfile [aux-k8s-eqiad] START helmfile.d/aus-k8s-eqiad-services/jaeger: apply	[production]
13:18	<cdanis@deploy1002>	helmfile [aux-k8s-eqiad] DONE helmfile.d/aus-k8s-eqiad-services/jaeger: apply	[production]
13:18	<cdanis@deploy1002>	helmfile [aux-k8s-eqiad] START helmfile.d/aus-k8s-eqiad-services/jaeger: apply	[production]
13:12	<cdanis@deploy1002>	helmfile [aux-k8s-eqiad] START helmfile.d/aus-k8s-eqiad-services/jaeger: apply	[production]
13:10	<topranks>	pushing updated BGP policy to cr2-eqord and cr2-eqdfw to announce Anycast ranges from network pops (T367439)	[production]
10:24	<arnaudb@cumin1002>	dbctl commit (dc=all): 'db1196 (re)pooling @ 100%: stopping T367781', diff saved to https://phabricator.wikimedia.org/P66396 and previous config saved to /var/cache/conftool/dbconfig/20240712-102416-arnaudb.json	[production]
10:22	<marostegui@cumin1002>	dbctl commit (dc=all): 'Depooling db1198 (T367856)', diff saved to https://phabricator.wikimedia.org/P66395 and previous config saved to /var/cache/conftool/dbconfig/20240712-102243-marostegui.json	[production]
10:22	<marostegui@cumin1002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1198.eqiad.wmnet with reason: Maintenance	[production]
10:22	<marostegui@cumin1002>	START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1198.eqiad.wmnet with reason: Maintenance	[production]
10:22	<marostegui@cumin1002>	dbctl commit (dc=all): 'Repooling after maintenance db1189 (T367856)', diff saved to https://phabricator.wikimedia.org/P66394 and previous config saved to /var/cache/conftool/dbconfig/20240712-102221-marostegui.json	[production]
10:18	<godog>	stop benthos@webrequest_live on centrallog2002 and start it on centrallog1002 - T369737	[production]
10:09	<arnaudb@cumin1002>	dbctl commit (dc=all): 'db1196 (re)pooling @ 75%: stopping T367781', diff saved to https://phabricator.wikimedia.org/P66393 and previous config saved to /var/cache/conftool/dbconfig/20240712-100910-arnaudb.json	[production]
10:07	<marostegui@cumin1002>	dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P66392 and previous config saved to /var/cache/conftool/dbconfig/20240712-100714-marostegui.json	[production]