2021-01-14
08:52 <marostegui@cumin1001> dbctl commit (dc=all): 'db1138 (re)pooling @ 25%: After restarting mysql', diff saved to https://phabricator.wikimedia.org/P13765 and previous config saved to /var/cache/conftool/dbconfig/20210114-085252-root.json [production]
08:52 <vgutierrez@cumin1001> START - Cookbook sre.hosts.reboot-single [production]
08:51 <vgutierrez> rolling restart of ncredir servers to catch up on kernel upgrades [production]
08:47 <jmm@cumin2001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) [production]
08:44 <jmm@cumin2001> START - Cookbook sre.hosts.reboot-single [production]
08:44 <jmm@cumin2001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) [production]
08:43 <XioNoX> standardize cloudsw interfaces to prepare for bringing the switches under Homer management [production]
08:42 <marostegui@cumin1001> dbctl commit (dc=all): 'Repool db2140 T271084', diff saved to https://phabricator.wikimedia.org/P13764 and previous config saved to /var/cache/conftool/dbconfig/20210114-084243-marostegui.json [production]
08:39 <jmm@cumin2001> START - Cookbook sre.hosts.reboot-single [production]
08:10 <jmm@cumin2001> END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) [production]
08:10 <jmm@cumin2001> START - Cookbook sre.ganeti.makevm [production]
00:22 <ryankemper> T266492 Restart of `relforge` successful [production]
00:20 <ryankemper@cumin1001> END (PASS) - Cookbook sre.elasticsearch.rolling-restart (exit_code=0) [production]
00:15 <chaomodus> completed rebooting Netbox hosts; the earlier failure was due to report errors that would not have recovered on their own [production]
00:14 <crusnov@cumin1001> END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) [production]
00:13 <ryankemper> `sudo -i cookbook sre.elasticsearch.rolling-restart relforge "relforge cluster restart" --task-id T266492 --nodes-per-run 1 --without-lvs` [production]
00:13 <ryankemper> (Forgot to tell it `relforge` isn't lvs-managed) [production]
00:13 <ryankemper@cumin1001> START - Cookbook sre.elasticsearch.rolling-restart [production]
00:10 <ryankemper@cumin1001> END (FAIL) - Cookbook sre.elasticsearch.rolling-restart (exit_code=99) [production]
00:10 <ryankemper> T266492 Beginning rolling restart of `relforge` [production]
00:09 <ryankemper@cumin1001> START - Cookbook sre.elasticsearch.rolling-restart [production]
00:04 <dzahn@cumin1001> conftool action : set/pooled=yes; selector: name=mw2240.codfw.wmnet [production]
00:04 <dzahn@cumin1001> conftool action : set/pooled=yes; selector: name=mw2239.codfw.wmnet [production]
00:04 <dzahn@cumin1001> conftool action : set/pooled=yes; selector: name=mw2238.codfw.wmnet [production]
00:04 <dzahn@cumin1001> conftool action : set/pooled=yes; selector: name=mw2237.codfw.wmnet [production]
00:01 <crusnov@cumin1001> START - Cookbook sre.hosts.reboot-single [production]
00:01 <crusnov@cumin1001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) [production]
00:00 <ryankemper> T266492 T268779 T265699 Rolling restart of `cloudelastic` was successful [production]
2021-01-13
23:53 <crusnov@cumin1001> START - Cookbook sre.hosts.reboot-single [production]
23:53 <crusnov@cumin1001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) [production]
23:49 <ryankemper@cumin1001> END (PASS) - Cookbook sre.elasticsearch.rolling-restart (exit_code=0) [production]
23:49 <crusnov@cumin1001> START - Cookbook sre.hosts.reboot-single [production]
23:49 <crusnov@cumin1001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) [production]
23:46 <crusnov@cumin1001> START - Cookbook sre.hosts.reboot-single [production]
23:46 <crusnov@cumin1001> END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) [production]
23:46 <crusnov@cumin1001> START - Cookbook sre.hosts.reboot-single [production]
23:44 <chaomodus> rebooting Netbox instances to apply updates [production]
23:18 <dzahn@cumin1001> conftool action : set/pooled=no; selector: name=mw2240.codfw.wmnet [production]
23:18 <dzahn@cumin1001> conftool action : set/pooled=no; selector: name=mw2239.codfw.wmnet [production]
23:18 <dzahn@cumin1001> conftool action : set/pooled=no; selector: name=mw2238.codfw.wmnet [production]
23:18 <dzahn@cumin1001> conftool action : set/pooled=no; selector: name=mw2237.codfw.wmnet [production]
22:53 <ryankemper@cumin1001> START - Cookbook sre.elasticsearch.rolling-restart [production]
22:53 <ryankemper> T266492 T268779 T265699 `sudo -i cookbook sre.elasticsearch.rolling-restart cloudelastic "cloudelastic cluster restart" --task-id T266492 --nodes-per-run 1` [production]
22:53 <ryankemper> T266492 T268779 T265699 Restarting cloudelastic to apply the new readahead changes; this will also verify that cloudelastic support works in our elasticsearch spicerack code. Going only one node at a time because cloudelastic elasticsearch indices have just 1 replica shard per index [production]
21:51 <dzahn@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on mw2239.codfw.wmnet with reason: new install on buster [production]
21:51 <dzahn@cumin1001> START - Cookbook sre.hosts.downtime for 4:00:00 on mw2239.codfw.wmnet with reason: new install on buster [production]
21:44 <dzahn@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2240.codfw.wmnet with reason: REIMAGE [production]
21:42 <dzahn@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2238.codfw.wmnet with reason: REIMAGE [production]
21:41 <dzahn@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on mw2240.codfw.wmnet with reason: REIMAGE [production]
21:40 <dzahn@cumin1001> END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on mw2239.codfw.wmnet with reason: REIMAGE [production]