production SAL

2022-10-11
13:01	<jgiannelos@deploy1002>	helmfile [codfw] START helmfile.d/services/mobileapps: apply	[production]
13:01	<jgiannelos@deploy1002>	helmfile [eqiad] DONE helmfile.d/services/mobileapps: apply	[production]
13:00	<jgiannelos@deploy1002>	helmfile [eqiad] START helmfile.d/services/mobileapps: apply	[production]
12:59	<jgiannelos@deploy1002>	helmfile [staging] DONE helmfile.d/services/mobileapps: apply	[production]
12:58	<jgiannelos@deploy1002>	helmfile [staging] START helmfile.d/services/mobileapps: apply	[production]
12:46	<vgutierrez>	partitioning the ATS cache in cp[2035-2036], cp[6004,6012], cp[1083-1084], cp[5005,5011], cp[3058-3059], cp[4025,4029] - T317748	[production]
12:39	<volans@cumin2002>	START - Cookbook sre.hosts.provision for host lvs4008.mgmt.ulsfo.wmnet with reboot policy FORCED	[production]
12:05	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db2110 (T314041)', diff saved to https://phabricator.wikimedia.org/P35397 and previous config saved to /var/cache/conftool/dbconfig/20221011-120514-ladsgroup.json	[production]
11:50	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db2110', diff saved to https://phabricator.wikimedia.org/P35396 and previous config saved to /var/cache/conftool/dbconfig/20221011-115007-ladsgroup.json	[production]
11:35	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db2110', diff saved to https://phabricator.wikimedia.org/P35395 and previous config saved to /var/cache/conftool/dbconfig/20221011-113501-ladsgroup.json	[production]
11:27	<jmm@cumin2002>	END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti1032.eqiad.wmnet to cluster eqiad and group A	[production]
11:26	<jmm@cumin2002>	START - Cookbook sre.ganeti.addnode for new host ganeti1032.eqiad.wmnet to cluster eqiad and group A	[production]
11:19	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db2110 (T314041)', diff saved to https://phabricator.wikimedia.org/P35394 and previous config saved to /var/cache/conftool/dbconfig/20221011-111954-ladsgroup.json	[production]
11:19	<jmm@cumin2002>	END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1032.eqiad.wmnet	[production]
11:13	<mwdebug-deploy@deploy1002>	helmfile [codfw] DONE helmfile.d/services/mwdebug: apply	[production]
11:12	<mwdebug-deploy@deploy1002>	helmfile [codfw] START helmfile.d/services/mwdebug: apply	[production]
11:12	<mwdebug-deploy@deploy1002>	helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply	[production]
11:11	<mwdebug-deploy@deploy1002>	helmfile [eqiad] START helmfile.d/services/mwdebug: apply	[production]
11:10	<jmm@cumin2002>	START - Cookbook sre.hosts.reboot-single for host ganeti1032.eqiad.wmnet	[production]
10:41	<volans@cumin2002>	END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host lvs4008.mgmt.ulsfo.wmnet with reboot policy FORCED	[production]
10:13	<volans@cumin2002>	START - Cookbook sre.hosts.provision for host lvs4008.mgmt.ulsfo.wmnet with reboot policy FORCED	[production]
10:12	<volans@cumin2002>	END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host lvs4008.mgmt.ulsfo.wmnet with reboot policy FORCED	[production]
10:08	<volans@cumin2002>	START - Cookbook sre.hosts.provision for host lvs4008.mgmt.ulsfo.wmnet with reboot policy FORCED	[production]
10:07	<volans@cumin2002>	END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host lvs4008.mgmt.ulsfo.wmnet with reboot policy FORCED	[production]
10:06	<volans@cumin2002>	START - Cookbook sre.hosts.provision for host lvs4008.mgmt.ulsfo.wmnet with reboot policy FORCED	[production]
10:02	<volans@cumin2002>	END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host lvs4008.mgmt.ulsfo.wmnet with reboot policy FORCED	[production]
09:57	<volans@cumin2002>	START - Cookbook sre.hosts.provision for host lvs4008.mgmt.ulsfo.wmnet with reboot policy FORCED	[production]
09:44	<jmm@cumin2002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on ganeti1006.eqiad.wmnet with reason: Remove from cluster for decom	[production]
09:44	<jmm@cumin2002>	START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on ganeti1006.eqiad.wmnet with reason: Remove from cluster for decom	[production]
08:53	<vgutierrez>	partitioning the ATS cache in cp1085, cp1086, cp2037, cp2038, cp3060, cp3061, cp4026, cp4030, cp5006, cp5012, cp6005, cp6013 - T317748	[production]
08:37	<jmm@cumin2002>	END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host ganeti4008.ulsfo.wmnet	[production]
07:41	<elukey@deploy1002>	helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .	[production]
07:40	<elukey@deploy1002>	helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .	[production]
07:31	<elukey@deploy1002>	helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .	[production]
07:30	<elukey@deploy1002>	helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .	[production]
07:24	<elukey@deploy1002>	helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .	[production]
07:22	<elukey@deploy1002>	helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .	[production]
07:21	<elukey@deploy1002>	helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' .	[production]
07:21	<elukey@deploy1002>	helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' .	[production]
07:18	<elukey@deploy1002>	helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .	[production]
07:18	<elukey@deploy1002>	helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .	[production]
07:17	<ryankemper>	[Elastic] Forcing recheck of elastic settings check alerts; expecting a bit of noise as the alerts resolve (hopefully)	[production]
07:17	<jmm@cumin2002>	START - Cookbook sre.hosts.reboot-single for host ganeti4008.ulsfo.wmnet	[production]
07:17	<elukey@deploy1002>	helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' .	[production]
07:16	<ryankemper>	[Elastic] Updated cross-cluster remote seeds (masters): `ryankemper@mwmaint1002:~/elastic$ python push_cross_cluster_conf.py https://search.svc.eqiad.wmnet:9[2,4,6]43/_cluster/settings --ccc chi=chi_eqiad_masters.lst psi=psi_eqiad_masters.lst omega=omega_eqiad_masters.lst`	[production]
07:15	<elukey@deploy1002>	helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' .	[production]
07:12	<elukey@deploy1002>	helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .	[production]
07:11	<elukey@deploy1002>	helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .	[production]
07:09	<kartik@deploy1002>	Finished scap: Backport for [[gerrit:839411|ContentTranslation: Make Mongolian Wikipedia MT stricter by 10% (T319156)]] (duration: 08m 56s)	[production]
07:01	<kartik@deploy1002>	kartik and kartik: Backport for [[gerrit:839411|ContentTranslation: Make Mongolian Wikipedia MT stricter by 10% (T319156)]] synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet	[production]