production SAL

2022-02-28
06:02	<ladsgroup@cumin1001>	START - Cookbook sre.hosts.reimage for host db1178.eqiad.wmnet with OS bullseye	[production]
05:57	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Depooling db1178 (T302185)', diff saved to https://phabricator.wikimedia.org/P21549 and previous config saved to /var/cache/conftool/dbconfig/20220228-055626-ladsgroup.json	[production]
05:56	<ladsgroup@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1178.eqiad.wmnet with reason: Maintenance	[production]
05:56	<ladsgroup@cumin1001>	START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1178.eqiad.wmnet with reason: Maintenance	[production]
05:55	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1172 (T302185)', diff saved to https://phabricator.wikimedia.org/P21548 and previous config saved to /var/cache/conftool/dbconfig/20220228-055530-ladsgroup.json	[production]
05:52	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P21547 and previous config saved to /var/cache/conftool/dbconfig/20220228-055226-ladsgroup.json	[production]
05:40	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1172', diff saved to https://phabricator.wikimedia.org/P21546 and previous config saved to /var/cache/conftool/dbconfig/20220228-054025-ladsgroup.json	[production]
05:38	<ladsgroup@deploy1002>	Synchronized php-1.38.0-wmf.23/includes/content/ContentHandler.php: Backport: [[gerrit:766136|ContentHandler: Use ParserOutputAccess for accessing ParserOutput (T302620)]] (duration: 00m 49s)	[production]
05:37	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1166 (T300992)', diff saved to https://phabricator.wikimedia.org/P21545 and previous config saved to /var/cache/conftool/dbconfig/20220228-053721-ladsgroup.json	[production]
05:25	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1172', diff saved to https://phabricator.wikimedia.org/P21544 and previous config saved to /var/cache/conftool/dbconfig/20220228-052521-ladsgroup.json	[production]
05:19	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Depooling db1166 (T300992)', diff saved to https://phabricator.wikimedia.org/P21543 and previous config saved to /var/cache/conftool/dbconfig/20220228-051905-ladsgroup.json	[production]
05:19	<ladsgroup@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1166.eqiad.wmnet with reason: Maintenance	[production]
05:18	<ladsgroup@cumin1001>	START - Cookbook sre.hosts.downtime for 6:00:00 on db1166.eqiad.wmnet with reason: Maintenance	[production]
05:10	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1172 (T302185)', diff saved to https://phabricator.wikimedia.org/P21542 and previous config saved to /var/cache/conftool/dbconfig/20220228-051016-ladsgroup.json	[production]
05:05	<ladsgroup@cumin1001>	END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1172.eqiad.wmnet with OS bullseye	[production]
04:56	<ladsgroup@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance	[production]
04:56	<ladsgroup@cumin1001>	START - Cookbook sre.hosts.downtime for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance	[production]
04:55	<ladsgroup@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance	[production]
04:55	<ladsgroup@cumin1001>	START - Cookbook sre.hosts.downtime for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance	[production]
04:49	<ladsgroup@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1172.eqiad.wmnet with reason: host reimage	[production]
04:46	<ladsgroup@cumin1001>	START - Cookbook sre.hosts.downtime for 2:00:00 on db1172.eqiad.wmnet with reason: host reimage	[production]
04:35	<ladsgroup@cumin1001>	START - Cookbook sre.hosts.reimage for host db1172.eqiad.wmnet with OS bullseye	[production]
04:30	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Depooling db1172 (T302185)', diff saved to https://phabricator.wikimedia.org/P21541 and previous config saved to /var/cache/conftool/dbconfig/20220228-043003-ladsgroup.json	[production]
04:30	<ladsgroup@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1172.eqiad.wmnet with reason: Maintenance	[production]
04:29	<ladsgroup@cumin1001>	START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1172.eqiad.wmnet with reason: Maintenance	[production]
2022-02-27
20:42	<XioNoX>	configure OSPF between cr2-drmrs and cr2-eqdfw	[production]
2022-02-25
23:32	<dzahn@deploy1002>	helmfile [staging] DONE helmfile.d/services/miscweb: apply	[production]
23:30	<dzahn@deploy1002>	helmfile [staging] START helmfile.d/services/miscweb: apply	[production]
21:37	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1144:3315 (T300992)', diff saved to https://phabricator.wikimedia.org/P21540 and previous config saved to /var/cache/conftool/dbconfig/20220225-213704-ladsgroup.json	[production]
21:22	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1144:3315', diff saved to https://phabricator.wikimedia.org/P21539 and previous config saved to /var/cache/conftool/dbconfig/20220225-212159-ladsgroup.json	[production]
21:06	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1144:3315', diff saved to https://phabricator.wikimedia.org/P21538 and previous config saved to /var/cache/conftool/dbconfig/20220225-210654-ladsgroup.json	[production]
21:02	<ryankemper>	[WDQS] Restarted wdqs eqiad exporters: `ryankemper@cumin1001:~$ sudo -E cumin -b 1 'wdqs1*' 'systemctl restart prometheus-blazegraph-exporter-wdqs-blazegraph.service'`	[production]
21:01	<ryankemper>	[WDQS Deploy] Deploy complete. Successful test query placed on query.wikidata.org, there's no relevant criticals in Icinga, and Grafana looks good. Still looking into `Reduced availability for job jmx_wdqs_updater`; will try restarting blazegraph exporters in eqiad	[production]
20:51	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1144:3315 (T300992)', diff saved to https://phabricator.wikimedia.org/P21537 and previous config saved to /var/cache/conftool/dbconfig/20220225-205149-ladsgroup.json	[production]
20:48	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Depooling db1144:3315 (T300992)', diff saved to https://phabricator.wikimedia.org/P21536 and previous config saved to /var/cache/conftool/dbconfig/20220225-204844-ladsgroup.json	[production]
20:48	<ladsgroup@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1144.eqiad.wmnet with reason: Maintenance	[production]
20:48	<ladsgroup@cumin1001>	START - Cookbook sre.hosts.downtime for 6:00:00 on db1144.eqiad.wmnet with reason: Maintenance	[production]
20:48	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1161 (T300992)', diff saved to https://phabricator.wikimedia.org/P21535 and previous config saved to /var/cache/conftool/dbconfig/20220225-204836-ladsgroup.json	[production]
20:33	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P21534 and previous config saved to /var/cache/conftool/dbconfig/20220225-203331-ladsgroup.json	[production]
20:31	<ryankemper>	[WDQS Deploy] Restarting `wdqs-categories` across lvs-managed hosts, one node at a time: `sudo -E cumin -b 1 'A:wdqs-all and not A:wdqs-test' 'depool && sleep 45 && systemctl restart wdqs-categories && sleep 45 && pool'`	[production]
20:31	<ryankemper>	[WDQS Deploy] Restarted `wdqs-categories` across all test hosts simultaneously: `sudo -E cumin 'A:wdqs-test' 'systemctl restart wdqs-categories'`	[production]
20:31	<ryankemper>	[WDQS Deploy] Restarted `wdqs-updater` across all hosts, 4 hosts at a time: `sudo -E cumin -b 4 'A:wdqs-all' 'systemctl restart wdqs-updater'`	[production]
20:30	<ryankemper@deploy1002>	Finished deploy [wdqs/wdqs@5d384a5]: 0.3.104 (duration: 07m 18s)	[production]
20:23	<ryankemper>	[WDQS Deploy] Tests passing following deploy of `0.3.104` on canary `wdqs1003`; proceeding to rest of fleet	[production]
20:22	<ryankemper@deploy1002>	Started deploy [wdqs/wdqs@5d384a5]: 0.3.104	[production]
20:22	<ryankemper>	[WDQS Deploy] Gearing up for deploy of wdqs `0.3.104`. Pre-deploy tests passing on canary `wdqs1003`	[production]
20:18	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P21533 and previous config saved to /var/cache/conftool/dbconfig/20220225-201826-ladsgroup.json	[production]
20:03	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1161 (T300992)', diff saved to https://phabricator.wikimedia.org/P21532 and previous config saved to /var/cache/conftool/dbconfig/20220225-200322-ladsgroup.json	[production]
19:59	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Depooling db1161 (T300992)', diff saved to https://phabricator.wikimedia.org/P21531 and previous config saved to /var/cache/conftool/dbconfig/20220225-195917-ladsgroup.json	[production]
19:59	<ladsgroup@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance	[production]