production SAL

2021-04-19 §
06:10	<_joe_>	upgrading envoy everywhere in codfw T280317	[production]
06:03	<marostegui@cumin1001>	dbctl commit (dc=all): 'Pool db1179 in s3 for the first time with minimal weight T275633', diff saved to https://phabricator.wikimedia.org/P15410 and previous config saved to /var/cache/conftool/dbconfig/20210419-060321-marostegui.json	[production]
06:01	<_joe_>	rolling out further envoy upgrades T280317	[production]
05:56	<marostegui@cumin1001>	dbctl commit (dc=all): 'db1074 (re)pooling @ 10%: Repool db1074', diff saved to https://phabricator.wikimedia.org/P15409 and previous config saved to /var/cache/conftool/dbconfig/20210419-055613-root.json	[production]
05:53	<marostegui>	Stop sanitarium master on s2 (lag will show up on clouddb* labsdb* hosts) T272008	[production]
05:52	<marostegui@cumin1001>	dbctl commit (dc=all): 'Depool db1074 T272008', diff saved to https://phabricator.wikimedia.org/P15408 and previous config saved to /var/cache/conftool/dbconfig/20210419-055240-marostegui.json	[production]
05:48	<marostegui@cumin1001>	dbctl commit (dc=all): 'Repool db1106', diff saved to https://phabricator.wikimedia.org/P15407 and previous config saved to /var/cache/conftool/dbconfig/20210419-054831-marostegui.json	[production]
05:42	<marostegui>	Stop sanitarium master on s1 (lag will show up on clouddb* labsdb* hosts) T272008	[production]
05:41	<marostegui@cumin1001>	dbctl commit (dc=all): 'Depool db1106 T272008', diff saved to https://phabricator.wikimedia.org/P15406 and previous config saved to /var/cache/conftool/dbconfig/20210419-054158-marostegui.json	[production]
05:37	<marostegui@cumin1001>	dbctl commit (dc=all): 'Pool db1179 in s3 for the first time with minimal weight T275633', diff saved to https://phabricator.wikimedia.org/P15405 and previous config saved to /var/cache/conftool/dbconfig/20210419-053730-marostegui.json	[production]
05:31	<marostegui@cumin1001>	dbctl commit (dc=all): 'Pool db1179 in s3 for the first time with minimal weight T275633', diff saved to https://phabricator.wikimedia.org/P15404 and previous config saved to /var/cache/conftool/dbconfig/20210419-053127-marostegui.json	[production]
05:30	<marostegui@cumin1001>	dbctl commit (dc=all): 'Add db1179 to dbctl T275633', diff saved to https://phabricator.wikimedia.org/P15403 and previous config saved to /var/cache/conftool/dbconfig/20210419-053050-marostegui.json	[production]
05:05	<marostegui>	Restart m2 database master T280251	[production]
2021-04-18 §
06:40	<Amir1>	cleaning watchlist of User:Mr._Ibrahem in wikidatawiki (in main ns only)	[production]
2021-04-17 §
16:16	<Amir1>	cleaning SuccuBot's watchlist in wikidatawiki	[production]
00:53	<dzahn@cumin1001>	conftool action : set/pooled=yes; selector: name=mw1307.eqiad.wmnet	[production]
00:48	<dzahn@cumin1001>	conftool action : set/pooled=no; selector: name=mw1307.eqiad.wmnet	[production]
00:23	<dzahn@cumin1001>	conftool action : set/pooled=yes; selector: name=mw1402.eqiad.wmnet	[production]
00:22	<dzahn@cumin1001>	conftool action : set/pooled=yes; selector: name=mw1403.eqiad.wmnet	[production]
00:18	<dzahn@cumin1001>	conftool action : set/pooled=no; selector: name=mw1403.eqiad.wmnet	[production]
00:18	<dzahn@cumin1001>	conftool action : set/pooled=no; selector: name=mw1402.eqiad.wmnet	[production]
00:14	<ryankemper>	T267927 `sudo run-puppet-agent` and `sudo pool` on `wdqs2003`	[production]
00:11	<dzahn@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1307.eqiad.wmnet with reason: REIMAGE	[production]
00:09	<dzahn@cumin1001>	START - Cookbook sre.hosts.downtime for 2:00:00 on mw1307.eqiad.wmnet with reason: REIMAGE	[production]
00:08	<ryankemper>	T267927 Reload of `wdqs2003` complete	[production]
00:07	<ryankemper@cumin2001>	END (ERROR) - Cookbook sre.wdqs.data-transfer (exit_code=97)	[production]
00:00	<dzahn@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1403.eqiad.wmnet with reason: REIMAGE	[production]
2021-04-16 §
23:58	<dzahn@cumin1001>	END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mwdebug1003.eqiad.wmnet	[production]
23:58	<dzahn@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1402.eqiad.wmnet with reason: REIMAGE	[production]
23:56	<dzahn@cumin1001>	START - Cookbook sre.hosts.downtime for 2:00:00 on mw1403.eqiad.wmnet with reason: REIMAGE	[production]
23:56	<dzahn@cumin1001>	START - Cookbook sre.hosts.downtime for 2:00:00 on mw1402.eqiad.wmnet with reason: REIMAGE	[production]
23:48	<dzahn@cumin1001>	conftool action : set/pooled=inactive; selector: name=mwdebug1003.eqiad.wmnet	[production]
23:47	<dzahn@cumin1001>	START - Cookbook sre.hosts.decommission for hosts mwdebug1003.eqiad.wmnet	[production]
23:47	<mutante>	decom'ing mwdebug1003, stretch VM created in T267248	[production]
23:39	<mutante>	reimaging last 3 remaining stretch appservers with buster, mw1307, mw1402, mw1403	[production]
23:37	<dzahn@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on mw[1402-1403].eqiad.wmnet with reason: reimage	[production]
23:36	<dzahn@cumin1001>	START - Cookbook sre.hosts.downtime for 3:00:00 on mw[1402-1403].eqiad.wmnet with reason: reimage	[production]
21:08	<ejegg>	updated fundraising python tools from ef54260b0d to 3d950fffbd	[production]
20:40	<Trey314159>	reindexing wikidata on cloudelastic... AGAIN (T274200)	[production]
17:48	<ryankemper>	T267927 Transferring from `wdqs2008`->`wdqs2003` to resolve the data corruption on `wdqs2003`	[production]
17:47	<ryankemper@cumin2001>	START - Cookbook sre.wdqs.data-transfer	[production]
17:41	<robh@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudcephosd1020.wikimedia.org with reason: REIMAGE	[production]
17:39	<robh@cumin1001>	START - Cookbook sre.hosts.downtime for 2:00:00 on cloudcephosd1020.wikimedia.org with reason: REIMAGE	[production]
17:39	<robh@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudcephosd1019.wikimedia.org with reason: REIMAGE	[production]
17:37	<robh@cumin1001>	START - Cookbook sre.hosts.downtime for 2:00:00 on cloudcephosd1019.wikimedia.org with reason: REIMAGE	[production]
17:35	<robh@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudcephosd1017.wikimedia.org with reason: REIMAGE	[production]
17:35	<mutante>	depooling mwdebug1003 (stretch VM, will be removed), mwdebug1001/1002 (buster) are unchanged	[production]
17:34	<dzahn@cumin1001>	conftool action : set/pooled=no; selector: name=mwdebug1003.eqiad.wmnet	[production]
17:33	<robh@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudcephosd1016.wikimedia.org with reason: REIMAGE	[production]
17:33	<robh@cumin1001>	START - Cookbook sre.hosts.downtime for 2:00:00 on cloudcephosd1017.wikimedia.org with reason: REIMAGE	[production]