production SAL

2022-03-10
11:36	<ladsgroup@cumin1001>	START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1141.eqiad.wmnet with reason: Maintenance	[production]
11:32	<marostegui@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1126', diff saved to https://phabricator.wikimedia.org/P22293 and previous config saved to /var/cache/conftool/dbconfig/20220310-113210-marostegui.json	[production]
11:29	<ebysans@deploy1002>	Finished deploy [airflow-dags/analytics@b681376]: (no justification provided) (duration: 00m 07s)	[production]
11:29	<ebysans@deploy1002>	Started deploy [airflow-dags/analytics@b681376]: (no justification provided)	[production]
11:26	<mvolz@deploy1002>	helmfile [eqiad] DONE helmfile.d/services/zotero: apply	[production]
11:26	<mvolz@deploy1002>	helmfile [eqiad] START helmfile.d/services/zotero: apply	[production]
11:25	<mvolz@deploy1002>	helmfile [codfw] DONE helmfile.d/services/zotero: apply	[production]
11:25	<jmm@cumin2002>	END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dumpsdata1007.eqiad.wmnet	[production]
11:25	<jmm@cumin1001>	END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host elastic1093.eqiad.wmnet	[production]
11:24	<mvolz@deploy1002>	helmfile [codfw] START helmfile.d/services/zotero: apply	[production]
11:24	<mvolz@deploy1002>	helmfile [staging] DONE helmfile.d/services/zotero: apply	[production]
11:24	<mvolz@deploy1002>	helmfile [staging] START helmfile.d/services/zotero: apply	[production]
11:20	<jmm@cumin2002>	START - Cookbook sre.hosts.reboot-single for host dumpsdata1007.eqiad.wmnet	[production]
11:18	<volans>	rolled out python3-wmflib v1.1.2 to the entire fleet (buster+ only)	[production]
11:17	<marostegui@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1126 (T300775)', diff saved to https://phabricator.wikimedia.org/P22292 and previous config saved to /var/cache/conftool/dbconfig/20220310-111705-marostegui.json	[production]
11:16	<jmm@cumin1001>	START - Cookbook sre.hosts.reboot-single for host elastic1093.eqiad.wmnet	[production]
11:14	<jmm@cumin1001>	END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host idp-test1001.wikimedia.org	[production]
11:13	<marostegui@cumin1001>	dbctl commit (dc=all): 'Depooling db1106 (T298294)', diff saved to https://phabricator.wikimedia.org/P22291 and previous config saved to /var/cache/conftool/dbconfig/20220310-111330-marostegui.json	[production]
11:13	<marostegui@cumin1001>	dbctl commit (dc=all): 'Depooling db1126 (T300775)', diff saved to https://phabricator.wikimedia.org/P22290 and previous config saved to /var/cache/conftool/dbconfig/20220310-111320-marostegui.json	[production]
11:13	<marostegui@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance	[production]
11:13	<marostegui@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1126.eqiad.wmnet with reason: Maintenance	[production]
11:13	<marostegui@cumin1001>	START - Cookbook sre.hosts.downtime for 16:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance	[production]
11:13	<marostegui@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1106.eqiad.wmnet with reason: Maintenance	[production]
11:13	<marostegui@cumin1001>	START - Cookbook sre.hosts.downtime for 12:00:00 on db1126.eqiad.wmnet with reason: Maintenance	[production]
11:13	<marostegui@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1177 (T300775)', diff saved to https://phabricator.wikimedia.org/P22289 and previous config saved to /var/cache/conftool/dbconfig/20220310-111313-marostegui.json	[production]
11:13	<marostegui@cumin1001>	START - Cookbook sre.hosts.downtime for 8:00:00 on db1106.eqiad.wmnet with reason: Maintenance	[production]
11:12	<mvolz@deploy1002>	helmfile [staging] DONE helmfile.d/services/zotero: apply	[production]
11:10	<mvolz@deploy1002>	helmfile [staging] START helmfile.d/services/zotero: apply	[production]
11:10	<jmm@cumin1001>	START - Cookbook sre.hosts.reboot-single for host idp-test1001.wikimedia.org	[production]
11:09	<jmm@cumin2002>	END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti6002.drmrs.wmnet	[production]
11:08	<marostegui@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on 14 hosts with reason: Maintenance	[production]
11:08	<marostegui@cumin1001>	START - Cookbook sre.hosts.downtime for 16:00:00 on 14 hosts with reason: Maintenance	[production]
11:08	<marostegui@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2103.codfw.wmnet with reason: Maintenance	[production]
11:08	<marostegui@cumin1001>	START - Cookbook sre.hosts.downtime for 8:00:00 on db2103.codfw.wmnet with reason: Maintenance	[production]
11:06	<mvolz@deploy1002>	helmfile [staging] DONE helmfile.d/services/zotero: apply	[production]
11:04	<mvolz@deploy1002>	helmfile [staging] START helmfile.d/services/zotero: apply	[production]
11:02	<marostegui@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance	[production]
11:02	<marostegui@cumin1001>	START - Cookbook sre.hosts.downtime for 8:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance	[production]
11:02	<marostegui@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1184 (T298294)', diff saved to https://phabricator.wikimedia.org/P22287 and previous config saved to /var/cache/conftool/dbconfig/20220310-110253-marostegui.json	[production]
10:58	<marostegui@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1177', diff saved to https://phabricator.wikimedia.org/P22286 and previous config saved to /var/cache/conftool/dbconfig/20220310-105807-marostegui.json	[production]
10:48	<jbond>	re-enable puppet fleet wide	[production]
10:47	<marostegui@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1184', diff saved to https://phabricator.wikimedia.org/P22285 and previous config saved to /var/cache/conftool/dbconfig/20220310-104748-marostegui.json	[production]
10:47	<jmm@cumin2002>	START - Cookbook sre.hosts.reboot-single for host ganeti6002.drmrs.wmnet	[production]
10:44	<akosiaris>	reboot rdb2009 for upgrades	[production]
10:44	<jbond>	disable puppet fleet wide	[production]
10:43	<marostegui@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1177', diff saved to https://phabricator.wikimedia.org/P22284 and previous config saved to /var/cache/conftool/dbconfig/20220310-104302-marostegui.json	[production]
10:42	<elukey@cumin1001>	END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kubernetes2010.codfw.wmnet with OS bullseye	[production]
10:32	<marostegui@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1184', diff saved to https://phabricator.wikimedia.org/P22283 and previous config saved to /var/cache/conftool/dbconfig/20220310-103243-marostegui.json	[production]
10:30	<moritzm>	failover ganeti master for drmrs/B13 to ganeti6004	[production]
10:29	<elukey@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubernetes2010.codfw.wmnet with reason: host reimage	[production]