production SAL

2022-08-25
20:18	<ladsgroup@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2101.codfw.wmnet with reason: Maintenance	[production]
20:18	<ladsgroup@cumin1001>	START - Cookbook sre.hosts.downtime for 12:00:00 on db2101.codfw.wmnet with reason: Maintenance	[production]
20:17	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db2115 (T312160)', diff saved to https://phabricator.wikimedia.org/P33155 and previous config saved to /var/cache/conftool/dbconfig/20220825-201756-ladsgroup.json	[production]
20:17	<urbanecm>	[urbanecm@deploy1002 ~]$ rm /var/lock/scap.operations_mediawiki-config.lock # connection to deploy1002 handled, to let me re-sync	[production]
20:14	<urandom>	re-rebooting ms-be2067 to "fix" disk enumeration(?) -- T314049	[production]
20:14	<mwdebug-deploy@deploy1002>	helmfile [codfw] DONE helmfile.d/services/mwdebug: apply	[production]
20:13	<mwdebug-deploy@deploy1002>	helmfile [codfw] START helmfile.d/services/mwdebug: apply	[production]
20:13	<mwdebug-deploy@deploy1002>	helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply	[production]
20:12	<mwdebug-deploy@deploy1002>	helmfile [eqiad] START helmfile.d/services/mwdebug: apply	[production]
20:11	<bking@cumin2002>	END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.UPGRADE (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic elasticsearch and plugin upgrade - bking@cumin2002 - T316159	[production]
20:11	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P33154 and previous config saved to /var/cache/conftool/dbconfig/20220825-201141-ladsgroup.json	[production]
20:07	<bking@cumin2002>	START - Cookbook sre.elasticsearch.rolling-operation Operation.UPGRADE (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic elasticsearch and plugin upgrade - bking@cumin2002 - T316159	[production]
20:02	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db2115', diff saved to https://phabricator.wikimedia.org/P33153 and previous config saved to /var/cache/conftool/dbconfig/20220825-200250-ladsgroup.json	[production]
19:56	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P33152 and previous config saved to /var/cache/conftool/dbconfig/20220825-195635-ladsgroup.json	[production]
19:47	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db2115', diff saved to https://phabricator.wikimedia.org/P33151 and previous config saved to /var/cache/conftool/dbconfig/20220825-194744-ladsgroup.json	[production]
19:42	<bking@cumin2002>	END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.UPGRADE (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic elasticsearch and plugin upgrade - bking@cumin2002 - T316159	[production]
19:41	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1165 (T316186)', diff saved to https://phabricator.wikimedia.org/P33150 and previous config saved to /var/cache/conftool/dbconfig/20220825-194129-ladsgroup.json	[production]
19:41	<bking@cumin2002>	START - Cookbook sre.elasticsearch.rolling-operation Operation.UPGRADE (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic elasticsearch and plugin upgrade - bking@cumin2002 - T316159	[production]
19:37	<andrew@cumin1001>	END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts cloudservices1003	[production]
19:37	<andrew@cumin1001>	END (PASS) - Cookbook sre.dns.netbox (exit_code=0)	[production]
19:36	<urandom>	rebooting ms-be2067 to "fix" disk enumeration(?) -- T314049	[production]
19:35	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Depooling db1165 (T316186)', diff saved to https://phabricator.wikimedia.org/P33149 and previous config saved to /var/cache/conftool/dbconfig/20220825-193513-ladsgroup.json	[production]
19:35	<ladsgroup@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance	[production]
19:34	<ladsgroup@cumin1001>	START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance	[production]
19:34	<ladsgroup@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1165.eqiad.wmnet with reason: Maintenance	[production]
19:34	<ladsgroup@cumin1001>	START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1165.eqiad.wmnet with reason: Maintenance	[production]
19:34	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1168 (T316186)', diff saved to https://phabricator.wikimedia.org/P33148 and previous config saved to /var/cache/conftool/dbconfig/20220825-193430-ladsgroup.json	[production]
19:33	<andrew@cumin1001>	START - Cookbook sre.dns.netbox	[production]
19:32	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db2115 (T312160)', diff saved to https://phabricator.wikimedia.org/P33147 and previous config saved to /var/cache/conftool/dbconfig/20220825-193238-ladsgroup.json	[production]
19:29	<andrew@cumin1001>	START - Cookbook sre.hosts.decommission for hosts cloudservices1003	[production]
19:19	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P33146 and previous config saved to /var/cache/conftool/dbconfig/20220825-191924-ladsgroup.json	[production]
19:04	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P33145 and previous config saved to /var/cache/conftool/dbconfig/20220825-190417-ladsgroup.json	[production]
18:49	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1168 (T316186)', diff saved to https://phabricator.wikimedia.org/P33144 and previous config saved to /var/cache/conftool/dbconfig/20220825-184911-ladsgroup.json	[production]
18:48	<bking@cumin2002>	END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.UPGRADE (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic elasticsearch and plugin upgrade - bking@cumin2002 - T316159	[production]
18:48	<ebernhardson@deploy1002>	Finished deploy [wikimedia/discovery/analytics@d00af45]: bump elasticsearch-hadoop to 7.10.2 (duration: 02m 07s)	[production]
18:47	<bking@cumin2002>	START - Cookbook sre.elasticsearch.rolling-operation Operation.UPGRADE (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic elasticsearch and plugin upgrade - bking@cumin2002 - T316159	[production]
18:45	<ebernhardson@deploy1002>	Started deploy [wikimedia/discovery/analytics@d00af45]: bump elasticsearch-hadoop to 7.10.2	[production]
18:43	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Depooling db1168 (T316186)', diff saved to https://phabricator.wikimedia.org/P33143 and previous config saved to /var/cache/conftool/dbconfig/20220825-184301-ladsgroup.json	[production]
18:42	<ladsgroup@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1168.eqiad.wmnet with reason: Maintenance	[production]
18:42	<ladsgroup@cumin1001>	START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1168.eqiad.wmnet with reason: Maintenance	[production]
18:42	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1131 (T316186)', diff saved to https://phabricator.wikimedia.org/P33142 and previous config saved to /var/cache/conftool/dbconfig/20220825-184233-ladsgroup.json	[production]
18:36	<otto@deploy1002>	helmfile [eqiad] DONE helmfile.d/services/eventgate-analytics-external: sync	[production]
18:36	<otto@deploy1002>	helmfile [eqiad] START helmfile.d/services/eventgate-analytics-external: sync	[production]
18:35	<otto@deploy1002>	helmfile [codfw] DONE helmfile.d/services/eventgate-analytics-external: sync	[production]
18:34	<otto@deploy1002>	helmfile [codfw] START helmfile.d/services/eventgate-analytics-external: sync	[production]
18:34	<otto@deploy1002>	helmfile [staging] DONE helmfile.d/services/eventgate-analytics-external: sync	[production]
18:33	<otto@deploy1002>	helmfile [staging] START helmfile.d/services/eventgate-analytics-external: sync	[production]
18:33	<ottomata>	rolling restart of eventgate-analytics-external to pick up retroactive schema change for android schemas in T316047	[production]
18:27	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1131', diff saved to https://phabricator.wikimedia.org/P33141 and previous config saved to /var/cache/conftool/dbconfig/20220825-182727-ladsgroup.json	[production]
18:19	<dancy@deploy1002>	rebuilt and synchronized wikiversions files: (no justification provided)	[production]