production SAL

2021-01-21
11:50	<marostegui@cumin1001>	dbctl commit (dc=all): 'db1085 (re)pooling @ 50%: After moving wikireplicas to another host', diff saved to https://phabricator.wikimedia.org/P13870 and previous config saved to /var/cache/conftool/dbconfig/20210121-115036-root.json	[production]
11:35	<marostegui@cumin1001>	dbctl commit (dc=all): 'db1085 (re)pooling @ 25%: After moving wikireplicas to another host', diff saved to https://phabricator.wikimedia.org/P13868 and previous config saved to /var/cache/conftool/dbconfig/20210121-113533-root.json	[production]
11:29	<marostegui>	Stop replication on db1085 to move wiki replicas under the other sanitarium host	[production]
11:28	<marostegui@cumin1001>	dbctl commit (dc=all): 'Depool db1085', diff saved to https://phabricator.wikimedia.org/P13867 and previous config saved to /var/cache/conftool/dbconfig/20210121-112849-marostegui.json	[production]
11:12	<hnowlan@deploy1001>	helmfile [staging] Ran 'sync' command on namespace 'similar-users' for release 'main' .	[production]
11:12	<hnowlan@deploy1001>	helmfile [staging] Ran 'sync' command on namespace 'similar-users' for release 'main' .	[production]
09:44	<hoo>	Updated the Wikidata property suggester with data from the 2021-01-11 JSON dump and applied the T132839 workarounds	[production]
09:00	<marostegui>	m1 master restart - T271540	[production]
08:51	<jynus>	stopping puppet and bacula for backup1001 T271540	[production]
08:43	<godog>	swift codfw-prod: more weight to ms-be20[58-61] - T269337	[production]
08:37	<marostegui>	Silence m1 hosts in preparation for the restart T271540	[production]
08:34	<godog>	roll-restart swift-object in codfw to apply new concurrency	[production]
07:21	<marostegui@cumin1001>	dbctl commit (dc=all): 'Fully repool db1099:3318', diff saved to https://phabricator.wikimedia.org/P13864 and previous config saved to /var/cache/conftool/dbconfig/20210121-072101-marostegui.json	[production]
07:03	<marostegui@cumin1001>	dbctl commit (dc=all): 'Slowly repoool db1099:3318', diff saved to https://phabricator.wikimedia.org/P13863 and previous config saved to /var/cache/conftool/dbconfig/20210121-070346-marostegui.json	[production]
06:55	<marostegui@cumin1001>	dbctl commit (dc=all): 'Slowly repoool db1099:3318', diff saved to https://phabricator.wikimedia.org/P13862 and previous config saved to /var/cache/conftool/dbconfig/20210121-065459-marostegui.json	[production]
06:54	<marostegui@cumin1001>	dbctl commit (dc=all): 'Repool db1087', diff saved to https://phabricator.wikimedia.org/P13861 and previous config saved to /var/cache/conftool/dbconfig/20210121-065408-marostegui.json	[production]
06:49	<marostegui@cumin1001>	dbctl commit (dc=all): 'Depool db1087 and pool db1099:3318 into s8 vslow', diff saved to https://phabricator.wikimedia.org/P13860 and previous config saved to /var/cache/conftool/dbconfig/20210121-064903-marostegui.json	[production]
03:54	<milimetric@deploy1001>	deploy aborted: Minor typo fix (duration: 01m 39s)	[production]
03:52	<milimetric@deploy1001>	Started deploy [analytics/refinery@57589e7]: Minor typo fix	[production]
01:27	<ryankemper>	[WDQS Deploy] Rollback complete, service health of `wdqs1003` is restored. Need to investigate source of 404 (possibly related to some recent changes we made in the `gui` repo)	[production]
01:26	<ryankemper@deploy1001>	Finished deploy [wdqs/wdqs@70f9d37]: 0.3.60 (duration: 02m 53s)	[production]
01:26	<ryankemper>	[WDQS Deploy] Rollback of canary `wdqs1003` initiated	[production]
01:25	<ryankemper>	[WDQS Deploy] Automated tests passing on canary`wdqs1003` but manually visiting `http://localhost:9999` (my tunnel to `wdqs1003`) gives `404 Not Found`from nginx; aborting deploy	[production]
01:23	<ryankemper@deploy1001>	Started deploy [wdqs/wdqs@70f9d37]: 0.3.60	[production]
01:22	<ryankemper>	[WDQS Deploy] Tests on canary `wdqs1003` passing before start of deploy, proceeding with deploy of wdqs `0.3.60` to canary	[production]
00:44	<legoktm>	legoktm@mwmaint1002:~$ mwscript initSiteStats.php --wiki=trwikivoyage --update	[production]
00:19	<dzahn@cumin1001>	conftool action : set/pooled=yes; selector: name=mw2369.codfw.wmnet	[production]
00:19	<dzahn@cumin1001>	conftool action : set/pooled=yes; selector: name=mw2367.codfw.wmnet	[production]
00:19	<dzahn@cumin1001>	conftool action : set/pooled=yes; selector: name=mw2365.codfw.wmnet	[production]
00:19	<dzahn@cumin1001>	conftool action : set/pooled=yes; selector: name=mw2363.codfw.wmnet	[production]
00:18	<dzahn@cumin1001>	conftool action : set/pooled=no; selector: name=mw2369.codfw.wmnet	[production]
00:18	<dzahn@cumin1001>	conftool action : set/pooled=no; selector: name=mw2365.codfw.wmnet	[production]
00:18	<dzahn@cumin1001>	conftool action : set/pooled=no; selector: name=mw2367.codfw.wmnet	[production]
00:17	<dzahn@cumin1001>	conftool action : set/pooled=no; selector: name=mw2363.codfw.wmnet	[production]
2021-01-20
23:51	<dzahn@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2369.codfw.wmnet with reason: REIMAGE	[production]
23:51	<dzahn@cumin1001>	END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on mw2365.codfw.wmnet with reason: REIMAGE	[production]
23:49	<dzahn@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2367.codfw.wmnet with reason: REIMAGE	[production]
23:47	<dzahn@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2363.codfw.wmnet with reason: REIMAGE	[production]
23:47	<dzahn@cumin1001>	START - Cookbook sre.hosts.downtime for 2:00:00 on mw2369.codfw.wmnet with reason: REIMAGE	[production]
23:47	<dzahn@cumin1001>	START - Cookbook sre.hosts.downtime for 2:00:00 on mw2367.codfw.wmnet with reason: REIMAGE	[production]
23:46	<dzahn@cumin1001>	START - Cookbook sre.hosts.downtime for 2:00:00 on mw2365.codfw.wmnet with reason: REIMAGE	[production]
23:45	<dzahn@cumin1001>	START - Cookbook sre.hosts.downtime for 2:00:00 on mw2363.codfw.wmnet with reason: REIMAGE	[production]
23:30	<mutante>	releases2002 - rebooting VM	[production]
23:25	<dzahn@cumin1001>	conftool action : set/pooled=yes; selector: name=mw2361.codfw.wmnet	[production]
23:25	<dzahn@cumin1001>	conftool action : set/pooled=yes; selector: name=mw2359.codfw.wmnet	[production]
23:25	<dzahn@cumin1001>	conftool action : set/pooled=yes; selector: name=mw2355.codfw.wmnet	[production]
23:25	<dzahn@cumin1001>	conftool action : set/pooled=yes; selector: name=mw2357.codfw.wmnet	[production]
23:22	<dzahn@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on releases2002.codfw.wmnet with reason: rebooting to add a disk	[production]
23:22	<dzahn@cumin1001>	START - Cookbook sre.hosts.downtime for 1:00:00 on releases2002.codfw.wmnet with reason: rebooting to add a disk	[production]
23:10	<dzahn@cumin1001>	conftool action : set/pooled=no; selector: name=mw2357.codfw.wmnet	[production]