production SAL

2022-01-25
10:24	<mwdebug-deploy@deploy1002>	helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn	[production]
10:24	<marostegui@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1171.eqiad.wmnet with reason: Maintenance	[production]
10:24	<marostegui@cumin1001>	START - Cookbook sre.hosts.downtime for 6:00:00 on db1171.eqiad.wmnet with reason: Maintenance	[production]
10:24	<marostegui@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1098:3317 (T299827)', diff saved to https://phabricator.wikimedia.org/P19116 and previous config saved to /var/cache/conftool/dbconfig/20220125-102426-marostegui.json	[production]
10:18	<jmm@cumin2002>	END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1013.eqiad.wmnet	[production]
10:13	<jmm@cumin2002>	START - Cookbook sre.hosts.reboot-single for host ganeti1013.eqiad.wmnet	[production]
10:11	<marostegui@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1143 (T285149)', diff saved to https://phabricator.wikimedia.org/P19115 and previous config saved to /var/cache/conftool/dbconfig/20220125-101114-marostegui.json	[production]
10:09	<mwdebug-deploy@deploy1002>	helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn	[production]
10:09	<marostegui@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1098:3317', diff saved to https://phabricator.wikimedia.org/P19114 and previous config saved to /var/cache/conftool/dbconfig/20220125-100921-marostegui.json	[production]
10:09	<marostegui@cumin1001>	dbctl commit (dc=all): 'Depooling db1143 (T285149)', diff saved to https://phabricator.wikimedia.org/P19113 and previous config saved to /var/cache/conftool/dbconfig/20220125-100907-marostegui.json	[production]
10:09	<marostegui@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1143.eqiad.wmnet with reason: Maintenance	[production]
10:09	<marostegui@cumin1001>	START - Cookbook sre.hosts.downtime for 6:00:00 on db1143.eqiad.wmnet with reason: Maintenance	[production]
10:09	<marostegui@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1144:3314 (T285149)', diff saved to https://phabricator.wikimedia.org/P19112 and previous config saved to /var/cache/conftool/dbconfig/20220125-100900-marostegui.json	[production]
10:08	<mwdebug-deploy@deploy1002>	helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn	[production]
10:08	<mwdebug-deploy@deploy1002>	helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn	[production]
10:06	<mwdebug-deploy@deploy1002>	helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn	[production]
10:04	<taavi@deploy1002>	Synchronized wmf-config/extension-list: Config: [[gerrit:755534|Undeploy UserMerge (3) (T216089)]] (duration: 00m 48s)	[production]
10:03	<marostegui@cumin1001>	START - Cookbook sre.hosts.reimage for host es2020.codfw.wmnet with OS bullseye	[production]
10:02	<ladsgroup@cumin1001>	START - Cookbook sre.hosts.reimage for host es2029.codfw.wmnet with OS bullseye	[production]
10:01	<taavi@deploy1002>	Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:755533|Undeploy UserMerge (2) (T216089)]] (duration: 00m 49s)	[production]
10:01	<mwdebug-deploy@deploy1002>	helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn	[production]
10:00	<mwdebug-deploy@deploy1002>	helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn	[production]
10:00	<ladsgroup@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2029.codfw.wmnet with reason: reimage for upgrade - T299911	[production]
10:00	<mwdebug-deploy@deploy1002>	helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn	[production]
10:00	<marostegui@cumin1001>	dbctl commit (dc=all): 'Depool es2020', diff saved to https://phabricator.wikimedia.org/P19111 and previous config saved to /var/cache/conftool/dbconfig/20220125-100036-marostegui.json	[production]
10:00	<ladsgroup@cumin1001>	START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on es2029.codfw.wmnet with reason: reimage for upgrade - T299911	[production]
09:59	<mwdebug-deploy@deploy1002>	helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn	[production]
09:59	<taavi@deploy1002>	Synchronized wmf-config/CommonSettings.php: Config: [[gerrit:755532|Undeploy UserMerge (1) (T216089)]] (duration: 00m 49s)	[production]
09:54	<marostegui@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1098:3317', diff saved to https://phabricator.wikimedia.org/P19110 and previous config saved to /var/cache/conftool/dbconfig/20220125-095417-marostegui.json	[production]
09:53	<marostegui@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1144:3314', diff saved to https://phabricator.wikimedia.org/P19109 and previous config saved to /var/cache/conftool/dbconfig/20220125-095355-marostegui.json	[production]
09:40	<jayme@deploy1002>	helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.	[production]
09:40	<jayme@deploy1002>	helmfile [staging-eqiad] START helmfile.d/admin 'apply'.	[production]
09:40	<mmandere@cumin1001>	END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host ncredir6001.drmrs.wmnet	[production]
09:39	<marostegui@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1098:3317 (T299827)', diff saved to https://phabricator.wikimedia.org/P19108 and previous config saved to /var/cache/conftool/dbconfig/20220125-093912-marostegui.json	[production]
09:38	<marostegui@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1144:3314', diff saved to https://phabricator.wikimedia.org/P19107 and previous config saved to /var/cache/conftool/dbconfig/20220125-093850-marostegui.json	[production]
09:38	<marostegui@cumin1001>	dbctl commit (dc=all): 'Depooling db1098:3317 (T299827)', diff saved to https://phabricator.wikimedia.org/P19106 and previous config saved to /var/cache/conftool/dbconfig/20220125-093806-marostegui.json	[production]
09:38	<marostegui@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance	[production]
09:38	<marostegui@cumin1001>	START - Cookbook sre.hosts.downtime for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance	[production]
09:23	<mmandere@cumin1001>	START - Cookbook sre.ganeti.makevm for new host ncredir6001.drmrs.wmnet	[production]
09:23	<marostegui@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1144:3314 (T285149)', diff saved to https://phabricator.wikimedia.org/P19105 and previous config saved to /var/cache/conftool/dbconfig/20220125-092346-marostegui.json	[production]
09:23	<dcausse>	restarting blazegraph on wdqs1004 (jvm stuck for 1h)	[production]
09:11	<jmm@cumin2002>	END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti1013.eqiad.wmnet with OS buster	[production]
08:52	<marostegui@cumin1001>	dbctl commit (dc=all): 'es1030 (re)pooling @ 100%: repooling after reimage', diff saved to https://phabricator.wikimedia.org/P19104 and previous config saved to /var/cache/conftool/dbconfig/20220125-085228-root.json	[production]
08:45	<moritzm>	draining instances off ganeti1005 for reimage	[production]
08:44	<jmm@cumin2002>	START - Cookbook sre.hosts.reimage for host ganeti1013.eqiad.wmnet with OS buster	[production]
08:37	<marostegui@cumin1001>	dbctl commit (dc=all): 'es1030 (re)pooling @ 75%: repooling after reimage', diff saved to https://phabricator.wikimedia.org/P19103 and previous config saved to /var/cache/conftool/dbconfig/20220125-083724-root.json	[production]
08:33	<mwdebug-deploy@deploy1002>	helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn	[production]
08:32	<jayme>	kubernetes staging migrated tainted worker node setup - T290967	[production]
08:32	<jayme@cumin1001>	END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host kubestagemaster1001.eqiad.wmnet	[production]
08:32	<mwdebug-deploy@deploy1002>	helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn	[production]