production SAL

2021-04-14
11:03	<akosiaris@deploy1002>	helmfile [eqiad] Ran 'sync' command on namespace 'linkrecommendation' for release 'production' .	[production]
11:03	<jiji@cumin1001>	START - Cookbook sre.hosts.downtime for 2:00:00 on wtp1036.eqiad.wmnet with reason: REIMAGE	[production]
11:03	<akosiaris@deploy1002>	helmfile [codfw] Ran 'sync' command on namespace 'linkrecommendation' for release 'external' .	[production]
11:03	<akosiaris@deploy1002>	helmfile [codfw] Ran 'sync' command on namespace 'linkrecommendation' for release 'staging' .	[production]
11:02	<akosiaris@deploy1002>	helmfile [codfw] Ran 'sync' command on namespace 'linkrecommendation' for release 'internal' .	[production]
11:02	<akosiaris@deploy1002>	helmfile [codfw] Ran 'sync' command on namespace 'linkrecommendation' for release 'production' .	[production]
11:02	<jiji@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wtp1034.eqiad.wmnet with reason: REIMAGE	[production]
11:01	<jiji@cumin1001>	START - Cookbook sre.hosts.downtime for 2:00:00 on wtp1035.eqiad.wmnet with reason: REIMAGE	[production]
10:59	<jiji@cumin1001>	START - Cookbook sre.hosts.downtime for 2:00:00 on wtp1034.eqiad.wmnet with reason: REIMAGE	[production]
10:52	<marostegui@cumin1001>	dbctl commit (dc=all): 'db1177 (re)pooling @ 70%: Slowly pool db1177 for the first time in s8 T275633', diff saved to https://phabricator.wikimedia.org/P15322 and previous config saved to /var/cache/conftool/dbconfig/20210414-105202-root.json	[production]
10:36	<marostegui@cumin1001>	dbctl commit (dc=all): 'db1177 (re)pooling @ 60%: Slowly pool db1177 for the first time in s8 T275633', diff saved to https://phabricator.wikimedia.org/P15321 and previous config saved to /var/cache/conftool/dbconfig/20210414-103659-root.json	[production]
10:30	<marostegui>	Failover m1 from db1080 to db1159 - T276448	[production]
10:25	<dcaro@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on 6 hosts with reason: Upgrading ceph to octopus	[production]
10:25	<dcaro@cumin1001>	START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on 6 hosts with reason: Upgrading ceph to octopus	[production]
10:21	<marostegui@cumin1001>	dbctl commit (dc=all): 'db1177 (re)pooling @ 50%: Slowly pool db1177 for the first time in s8 T275633', diff saved to https://phabricator.wikimedia.org/P15320 and previous config saved to /var/cache/conftool/dbconfig/20210414-102153-root.json	[production]
10:06	<marostegui@cumin1001>	dbctl commit (dc=all): 'db1177 (re)pooling @ 40%: Slowly pool db1177 for the first time in s8 T275633', diff saved to https://phabricator.wikimedia.org/P15319 and previous config saved to /var/cache/conftool/dbconfig/20210414-100649-root.json	[production]
09:51	<marostegui@cumin1001>	dbctl commit (dc=all): 'db1177 (re)pooling @ 30%: Slowly pool db1177 for the first time in s8 T275633', diff saved to https://phabricator.wikimedia.org/P15318 and previous config saved to /var/cache/conftool/dbconfig/20210414-095146-root.json	[production]
09:37	<ryankemper@cumin2001>	END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)	[production]
09:36	<marostegui@cumin1001>	dbctl commit (dc=all): 'db1177 (re)pooling @ 20%: Slowly pool db1177 for the first time in s8 T275633', diff saved to https://phabricator.wikimedia.org/P15317 and previous config saved to /var/cache/conftool/dbconfig/20210414-093642-root.json	[production]
09:33	<marostegui@cumin1001>	dbctl commit (dc=all): 'Pool db1177 with minimal weight on s8 for the first time T275633', diff saved to https://phabricator.wikimedia.org/P15316 and previous config saved to /var/cache/conftool/dbconfig/20210414-093305-marostegui.json	[production]
09:29	<gehel>	depooling wdqs1004 - corrupted data after data reload	[production]
09:27	<effie>	disable puppet on all mediawiki servers to merge 676580	[production]
09:24	<urbanecm@deploy1002>	Synchronized php-1.37.0-wmf.1/extensions/DiscussionTools/includes/Hooks/HookUtils.php: e4b2d93dcf86a336314ed09fd37844edb16f4f30: Dont allow query and cookie hacks to enable topic subscriptions (T280082) (duration: 01m 24s)	[production]
09:23	<gehel>	repooling wdqs1013, caught up on lag	[production]
09:22	<gehel>	depooling wdqs1003 - corrupted data after data reload	[production]
09:19	<jmm@cumin1001>	END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts kraz.wikimedia.org	[production]
09:16	<gehel>	restarting blazegraph on wdqs1003	[production]
09:12	<ryankemper>	T267927 depooled `wdqs1004` following data transfer (catching up on lag), current round of data transfers is done so there shouldn't be any left to depool	[production]
09:10	<ryankemper@cumin2001>	END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)	[production]
09:09	<jmm@cumin1001>	START - Cookbook sre.hosts.decommission for hosts kraz.wikimedia.org	[production]
09:06	<ryankemper>	T267927 depool `wdqs2001` following data transfer (catching up on lag)	[production]
09:03	<jmm@cumin1001>	END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts bast1002.wikimedia.org	[production]
09:03	<ryankemper@cumin2001>	END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)	[production]
08:53	<jmm@cumin1001>	START - Cookbook sre.hosts.decommission for hosts bast1002.wikimedia.org	[production]
08:44	<Urbanecm>	Run scap pull on mwdebug1002	[production]
08:40	<Urbanecm>	Staging on mwdebug1002	[production]
08:20	<akosiaris@cumin1001>	conftool action : set/weight=10; selector: cluster=videoscaler,service=apache2,name=mw2394.codfw.wmnet	[production]
08:20	<akosiaris@cumin1001>	conftool action : set/weight=10; selector: cluster=videoscaler,service=apache2,name=mw2395.codfw.wmnet	[production]
08:16	<jiji@cumin1001>	conftool action : set/pooled=yes; selector: name=(wtp1033.eqiad.wmnet\|wtp1032.eqiad.wmnet)	[production]
08:07	<jayme>	updated chartmuseum to 0.13.1 on chartmuseum1001, chartmuseum2001	[production]
08:06	<jayme@cumin1001>	conftool action : set/pooled=true; selector: dnsdisc=helm-charts,name=eqiad	[production]
08:05	<gehel>	depooling wdqs2004 - catching up on lag	[production]
08:01	<ryankemper@cumin2001>	END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)	[production]
07:59	<gehel>	depooling wdqs2001 - catching up on lag	[production]
07:57	<gehel>	depooling wdqs1013 - catching up on lag	[production]
07:56	<ryankemper@cumin2001>	END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)	[production]
07:55	<gehel>	restarting blazegraph + updater on wdqs1013	[production]
07:51	<jayme@cumin1001>	conftool action : set/pooled=false; selector: dnsdisc=helm-charts,name=eqiad	[production]
07:51	<jayme@cumin1001>	conftool action : set/pooled=true; selector: dnsdisc=helm-charts,name=codfw	[production]
07:42	<jayme@cumin1001>	conftool action : set/pooled=false; selector: dnsdisc=helm-charts,name=codfw	[production]