production SAL

2022-08-25
19:56	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P33152 and previous config saved to /var/cache/conftool/dbconfig/20220825-195635-ladsgroup.json	[production]
19:47	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db2115', diff saved to https://phabricator.wikimedia.org/P33151 and previous config saved to /var/cache/conftool/dbconfig/20220825-194744-ladsgroup.json	[production]
19:42	<bking@cumin2002>	END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.UPGRADE (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic elasticsearch and plugin upgrade - bking@cumin2002 - T316159	[production]
19:41	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1165 (T316186)', diff saved to https://phabricator.wikimedia.org/P33150 and previous config saved to /var/cache/conftool/dbconfig/20220825-194129-ladsgroup.json	[production]
19:41	<bking@cumin2002>	START - Cookbook sre.elasticsearch.rolling-operation Operation.UPGRADE (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic elasticsearch and plugin upgrade - bking@cumin2002 - T316159	[production]
19:37	<andrew@cumin1001>	END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts cloudservices1003	[production]
19:37	<andrew@cumin1001>	END (PASS) - Cookbook sre.dns.netbox (exit_code=0)	[production]
19:36	<urandom>	rebooting ms-be2067 to "fix" disk enumeration(?) -- T314049	[production]
19:35	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Depooling db1165 (T316186)', diff saved to https://phabricator.wikimedia.org/P33149 and previous config saved to /var/cache/conftool/dbconfig/20220825-193513-ladsgroup.json	[production]
19:35	<ladsgroup@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance	[production]
19:34	<ladsgroup@cumin1001>	START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance	[production]
19:34	<ladsgroup@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1165.eqiad.wmnet with reason: Maintenance	[production]
19:34	<ladsgroup@cumin1001>	START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1165.eqiad.wmnet with reason: Maintenance	[production]
19:34	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1168 (T316186)', diff saved to https://phabricator.wikimedia.org/P33148 and previous config saved to /var/cache/conftool/dbconfig/20220825-193430-ladsgroup.json	[production]
19:33	<andrew@cumin1001>	START - Cookbook sre.dns.netbox	[production]
19:32	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db2115 (T312160)', diff saved to https://phabricator.wikimedia.org/P33147 and previous config saved to /var/cache/conftool/dbconfig/20220825-193238-ladsgroup.json	[production]
19:29	<andrew@cumin1001>	START - Cookbook sre.hosts.decommission for hosts cloudservices1003	[production]
19:19	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P33146 and previous config saved to /var/cache/conftool/dbconfig/20220825-191924-ladsgroup.json	[production]
19:04	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P33145 and previous config saved to /var/cache/conftool/dbconfig/20220825-190417-ladsgroup.json	[production]
18:49	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1168 (T316186)', diff saved to https://phabricator.wikimedia.org/P33144 and previous config saved to /var/cache/conftool/dbconfig/20220825-184911-ladsgroup.json	[production]
18:48	<bking@cumin2002>	END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.UPGRADE (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic elasticsearch and plugin upgrade - bking@cumin2002 - T316159	[production]
18:48	<ebernhardson@deploy1002>	Finished deploy [wikimedia/discovery/analytics@d00af45]: bump elasticsearch-hadoop to 7.10.2 (duration: 02m 07s)	[production]
18:47	<bking@cumin2002>	START - Cookbook sre.elasticsearch.rolling-operation Operation.UPGRADE (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic elasticsearch and plugin upgrade - bking@cumin2002 - T316159	[production]
18:45	<ebernhardson@deploy1002>	Started deploy [wikimedia/discovery/analytics@d00af45]: bump elasticsearch-hadoop to 7.10.2	[production]
18:43	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Depooling db1168 (T316186)', diff saved to https://phabricator.wikimedia.org/P33143 and previous config saved to /var/cache/conftool/dbconfig/20220825-184301-ladsgroup.json	[production]
18:42	<ladsgroup@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1168.eqiad.wmnet with reason: Maintenance	[production]
18:42	<ladsgroup@cumin1001>	START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1168.eqiad.wmnet with reason: Maintenance	[production]
18:42	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1131 (T316186)', diff saved to https://phabricator.wikimedia.org/P33142 and previous config saved to /var/cache/conftool/dbconfig/20220825-184233-ladsgroup.json	[production]
18:36	<otto@deploy1002>	helmfile [eqiad] DONE helmfile.d/services/eventgate-analytics-external: sync	[production]
18:36	<otto@deploy1002>	helmfile [eqiad] START helmfile.d/services/eventgate-analytics-external: sync	[production]
18:35	<otto@deploy1002>	helmfile [codfw] DONE helmfile.d/services/eventgate-analytics-external: sync	[production]
18:34	<otto@deploy1002>	helmfile [codfw] START helmfile.d/services/eventgate-analytics-external: sync	[production]
18:34	<otto@deploy1002>	helmfile [staging] DONE helmfile.d/services/eventgate-analytics-external: sync	[production]
18:33	<otto@deploy1002>	helmfile [staging] START helmfile.d/services/eventgate-analytics-external: sync	[production]
18:33	<ottomata>	rolling restart of eventgate-analytics-external to pick up retroactive schema change for android schemas in T316047	[production]
18:27	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1131', diff saved to https://phabricator.wikimedia.org/P33141 and previous config saved to /var/cache/conftool/dbconfig/20220825-182727-ladsgroup.json	[production]
18:19	<dancy@deploy1002>	rebuilt and synchronized wikiversions files: (no justification provided)	[production]
18:18	<bmansurov@deploy1002>	Finished deploy [airflow-dags/research@5712187]: (no justification provided) (duration: 00m 09s)	[production]
18:18	<bmansurov@deploy1002>	Started deploy [airflow-dags/research@5712187]: (no justification provided)	[production]
18:13	<dancy@deploy1002>	Installation of scap version "4.15.0" completed for 557 hosts	[production]
18:12	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1131', diff saved to https://phabricator.wikimedia.org/P33140 and previous config saved to /var/cache/conftool/dbconfig/20220825-181221-ladsgroup.json	[production]
18:11	<dancy@deploy1002>	Installing scap version "4.15.0" for 557 hosts	[production]
18:11	<dancy@deploy1002>	install-world aborted: (duration: 00m 02s)	[production]
17:57	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1131 (T316186)', diff saved to https://phabricator.wikimedia.org/P33139 and previous config saved to /var/cache/conftool/dbconfig/20220825-175715-ladsgroup.json	[production]
17:49	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Depooling db1131 (T316186)', diff saved to https://phabricator.wikimedia.org/P33138 and previous config saved to /var/cache/conftool/dbconfig/20220825-174946-ladsgroup.json	[production]
17:49	<ladsgroup@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1131.eqiad.wmnet with reason: Maintenance	[production]
17:49	<ladsgroup@cumin1001>	START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1131.eqiad.wmnet with reason: Maintenance	[production]
17:48	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Depooling db2115 (T312160)', diff saved to https://phabricator.wikimedia.org/P33137 and previous config saved to /var/cache/conftool/dbconfig/20220825-174826-ladsgroup.json	[production]
17:48	<ladsgroup@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2115.codfw.wmnet with reason: Maintenance	[production]
17:47	<ladsgroup@cumin1001>	START - Cookbook sre.hosts.downtime for 12:00:00 on db2115.codfw.wmnet with reason: Maintenance	[production]