production SAL

8451-8500 of 10000 results (70ms)

2022-01-11 §
13:43	<btullis@cumin1001>	END (PASS) - Cookbook sre.zookeeper.roll-restart-zookeeper (exit_code=0) for Zookeeper A:zookeeper-druid-analytics cluster: Roll restart of jvm daemons.	[production]
13:42	<marostegui@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1147', diff saved to https://phabricator.wikimedia.org/P18567 and previous config saved to /var/cache/conftool/dbconfig/20220111-134239-marostegui.json	[production]
13:36	<btullis@cumin1001>	START - Cookbook sre.zookeeper.roll-restart-zookeeper for Zookeeper A:zookeeper-druid-analytics cluster: Roll restart of jvm daemons.	[production]
13:36	<btullis@cumin1001>	END (PASS) - Cookbook sre.zookeeper.roll-restart-zookeeper (exit_code=0) for Zookeeper A:zookeeper-analytics cluster: Roll restart of jvm daemons.	[production]
13:33	<moritzm>	installing 4.9.290 kernels von stretch systems (no reboots yet)	[production]
13:29	<btullis@cumin1001>	START - Cookbook sre.zookeeper.roll-restart-zookeeper for Zookeeper A:zookeeper-analytics cluster: Roll restart of jvm daemons.	[production]
13:27	<marostegui@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1147 (T297191)', diff saved to https://phabricator.wikimedia.org/P18565 and previous config saved to /var/cache/conftool/dbconfig/20220111-132734-marostegui.json	[production]
13:26	<marostegui@cumin1001>	dbctl commit (dc=all): 'Depooling db1147 (T297191)', diff saved to https://phabricator.wikimedia.org/P18564 and previous config saved to /var/cache/conftool/dbconfig/20220111-132627-marostegui.json	[production]
13:26	<marostegui@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1147.eqiad.wmnet with reason: Maintenance	[production]
13:26	<marostegui@cumin1001>	START - Cookbook sre.hosts.downtime for 6:00:00 on db1147.eqiad.wmnet with reason: Maintenance	[production]
13:26	<marostegui@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance	[production]
13:26	<marostegui@cumin1001>	START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance	[production]
13:26	<marostegui@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance	[production]
13:26	<marostegui@cumin1001>	START - Cookbook sre.hosts.downtime for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance	[production]
13:26	<marostegui@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1150.eqiad.wmnet with reason: Maintenance	[production]
13:26	<marostegui@cumin1001>	START - Cookbook sre.hosts.downtime for 6:00:00 on db1150.eqiad.wmnet with reason: Maintenance	[production]
13:12	<mwdebug-deploy@deploy1002>	helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn	[production]
13:11	<jmm@cumin2002>	END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM people1003.eqiad.wmnet	[production]
13:09	<mwdebug-deploy@deploy1002>	helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn	[production]
13:09	<mwdebug-deploy@deploy1002>	helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn	[production]
13:07	<jmm@cumin2002>	START - Cookbook sre.ganeti.reboot-vm for VM people1003.eqiad.wmnet	[production]
13:05	<mwdebug-deploy@deploy1002>	helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn	[production]
13:04	<jmm@cumin2002>	END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM planet1002.eqiad.wmnet	[production]
12:59	<jmm@cumin2002>	START - Cookbook sre.ganeti.reboot-vm for VM planet1002.eqiad.wmnet	[production]
12:45	<mwdebug-deploy@deploy1002>	helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn	[production]
12:41	<mwdebug-deploy@deploy1002>	helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn	[production]
12:41	<mwdebug-deploy@deploy1002>	helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn	[production]
12:37	<mwdebug-deploy@deploy1002>	helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn	[production]
12:21	<marostegui@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1181 (T297191)', diff saved to https://phabricator.wikimedia.org/P18563 and previous config saved to /var/cache/conftool/dbconfig/20220111-122143-marostegui.json	[production]
12:17	<mwdebug-deploy@deploy1002>	helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn	[production]
12:15	<mwdebug-deploy@deploy1002>	helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn	[production]
12:15	<mwdebug-deploy@deploy1002>	helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn	[production]
12:15	<cparle@deploy1002>	Synchronized wmf-config: Config: [[gerrit:752599\|Enable support for references (T230315)]] (duration: 01m 00s)	[production]
12:14	<jmm@cumin2002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on kubetcd2004.codfw.wmnet with reason: switch to plain disk storage	[production]
12:14	<jmm@cumin2002>	START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on kubetcd2004.codfw.wmnet with reason: switch to plain disk storage	[production]
12:14	<mwdebug-deploy@deploy1002>	helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn	[production]
12:10	<marostegui@cumin1001>	dbctl commit (dc=all): 'db1104 (re)pooling @ 100%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P18562 and previous config saved to /var/cache/conftool/dbconfig/20220111-121025-root.json	[production]
12:06	<marostegui@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1181', diff saved to https://phabricator.wikimedia.org/P18561 and previous config saved to /var/cache/conftool/dbconfig/20220111-120638-marostegui.json	[production]
12:00	<moritzm>	reverting kubetcd2004.codfw.wmnet back to "plain" storage	[production]
11:56	<moritzm>	rebalance ganeti row A (all nodes reimaged to Buster)	[production]
11:55	<marostegui@cumin1001>	dbctl commit (dc=all): 'db1104 (re)pooling @ 75%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P18560 and previous config saved to /var/cache/conftool/dbconfig/20220111-115522-root.json	[production]
11:51	<marostegui@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1181', diff saved to https://phabricator.wikimedia.org/P18559 and previous config saved to /var/cache/conftool/dbconfig/20220111-115133-marostegui.json	[production]
11:41	<jmm@cumin2002>	END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2019.codfw.wmnet	[production]
11:40	<marostegui@cumin1001>	dbctl commit (dc=all): 'db1104 (re)pooling @ 50%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P18558 and previous config saved to /var/cache/conftool/dbconfig/20220111-114018-root.json	[production]
11:36	<marostegui@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1181 (T297191)', diff saved to https://phabricator.wikimedia.org/P18557 and previous config saved to /var/cache/conftool/dbconfig/20220111-113628-marostegui.json	[production]
11:35	<jmm@cumin2002>	START - Cookbook sre.hosts.reboot-single for host ganeti2019.codfw.wmnet	[production]
11:32	<marostegui@cumin1001>	dbctl commit (dc=all): 'Depooling db1181 (T297191)', diff saved to https://phabricator.wikimedia.org/P18556 and previous config saved to /var/cache/conftool/dbconfig/20220111-113216-marostegui.json	[production]
11:32	<marostegui@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1181.eqiad.wmnet with reason: Maintenance	[production]
11:32	<marostegui@cumin1001>	START - Cookbook sre.hosts.downtime for 6:00:00 on db1181.eqiad.wmnet with reason: Maintenance	[production]
11:32	<marostegui@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1158 (T297191)', diff saved to https://phabricator.wikimedia.org/P18555 and previous config saved to /var/cache/conftool/dbconfig/20220111-113208-marostegui.json	[production]