production SAL

2022-08-02
12:59	<jmm@cumin2002>	START - Cookbook sre.hosts.downtime for 3 days, 0:00:00 on ganeti2013.codfw.wmnet with reason: Remove node for eventual reimage, T311686	[production]
12:48	<marostegui@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P32151 and previous config saved to /var/cache/conftool/dbconfig/20220802-124845-marostegui.json	[production]
12:33	<marostegui@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P32150 and previous config saved to /var/cache/conftool/dbconfig/20220802-123338-marostegui.json	[production]
12:18	<marostegui@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1180 (T312972)', diff saved to https://phabricator.wikimedia.org/P32149 and previous config saved to /var/cache/conftool/dbconfig/20220802-121832-marostegui.json	[production]
12:16	<marostegui@cumin1001>	dbctl commit (dc=all): 'Depooling db1180 (T312972)', diff saved to https://phabricator.wikimedia.org/P32148 and previous config saved to /var/cache/conftool/dbconfig/20220802-121624-marostegui.json	[production]
12:16	<marostegui@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1180.eqiad.wmnet with reason: Maintenance	[production]
12:15	<marostegui@cumin1001>	START - Cookbook sre.hosts.downtime for 6:00:00 on db1180.eqiad.wmnet with reason: Maintenance	[production]
12:13	<mwdebug-deploy@deploy1002>	helmfile [codfw] DONE helmfile.d/services/mwdebug: apply	[production]
12:12	<mwdebug-deploy@deploy1002>	helmfile [codfw] START helmfile.d/services/mwdebug: apply	[production]
12:12	<mwdebug-deploy@deploy1002>	helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply	[production]
12:11	<mwdebug-deploy@deploy1002>	helmfile [eqiad] START helmfile.d/services/mwdebug: apply	[production]
12:01	<marostegui>	dbmaint x1@eqiad T314087	[production]
11:57	<marostegui>	dbmaint s7@eqiad T314377	[production]
11:57	<marostegui>	dbmaint s3@eqiad T314377	[production]
11:57	<marostegui>	dbmaint s8@eqiad T314377	[production]
11:54	<marostegui>	dbmaint s8@eqiad T314377	[production]
11:54	<marostegui>	dbmaint s3@eqiad T314377	[production]
11:50	<elukey@deploy1002>	helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .	[production]
11:48	<marostegui>	dbmaint s7@eqiad T314377	[production]
11:46	<marostegui>	dbmaint s4@eqiad T314377	[production]
11:35	<elukey>	restart rsyslog on ml-serve1006	[production]
10:50	<btullis@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on an-worker1082.eqiad.wmnet with reason: T312626 btullis	[production]
10:50	<btullis@cumin1001>	START - Cookbook sre.hosts.downtime for 3 days, 0:00:00 on an-worker1082.eqiad.wmnet with reason: T312626 btullis	[production]
10:49	<godog>	grow sda3 by 100G on thanos-be2004 - T314275	[production]
10:42	<btullis@puppetmaster1001>	conftool action : set/pooled=inactive; selector: cluster=wikireplicas-b,name=dbproxy1018.eqiad.wmnet	[production]
10:42	<btullis@puppetmaster1001>	conftool action : set/pooled=yes; selector: cluster=wikireplicas-b,name=dbproxy1019.eqiad.wmnet	[production]
10:35	<mwdebug-deploy@deploy1002>	helmfile [codfw] DONE helmfile.d/services/mwdebug: apply	[production]
10:34	<mwdebug-deploy@deploy1002>	helmfile [codfw] START helmfile.d/services/mwdebug: apply	[production]
10:34	<mwdebug-deploy@deploy1002>	helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply	[production]
10:34	<mwdebug-deploy@deploy1002>	helmfile [eqiad] START helmfile.d/services/mwdebug: apply	[production]
10:33	<marostegui@cumin1001>	dbctl commit (dc=all): 'db1181 (re)pooling @ 100%: After restart', diff saved to https://phabricator.wikimedia.org/P32147 and previous config saved to /var/cache/conftool/dbconfig/20220802-103318-root.json	[production]
10:18	<marostegui@cumin1001>	dbctl commit (dc=all): 'db1181 (re)pooling @ 75%: After restart', diff saved to https://phabricator.wikimedia.org/P32146 and previous config saved to /var/cache/conftool/dbconfig/20220802-101813-root.json	[production]
10:15	<marostegui@cumin1001>	dbctl commit (dc=all): 'Add db2175 to s2 T311494', diff saved to https://phabricator.wikimedia.org/P32145 and previous config saved to /var/cache/conftool/dbconfig/20220802-101522-marostegui.json	[production]
10:12	<btullis@cumin1001>	END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dbproxy1019.eqiad.wmnet with OS bullseye	[production]
10:05	<jynus>	shutdown dbprov2002 backup2005 backup2008 T310070	[production]
10:03	<marostegui@cumin1001>	dbctl commit (dc=all): 'db1181 (re)pooling @ 50%: After restart', diff saved to https://phabricator.wikimedia.org/P32144 and previous config saved to /var/cache/conftool/dbconfig/20220802-100308-root.json	[production]
10:03	<marostegui@cumin1001>	dbctl commit (dc=all): 'db1174 (re)pooling @ 100%: After maintenance', diff saved to https://phabricator.wikimedia.org/P32143 and previous config saved to /var/cache/conftool/dbconfig/20220802-100304-root.json	[production]
09:54	<marostegui@cumin1001>	dbctl commit (dc=all): 'Remove db2079 from dbctl T313885', diff saved to https://phabricator.wikimedia.org/P32141 and previous config saved to /var/cache/conftool/dbconfig/20220802-095455-marostegui.json	[production]
09:52	<btullis@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dbproxy1019.eqiad.wmnet with reason: host reimage	[production]
09:49	<btullis@cumin1001>	START - Cookbook sre.hosts.downtime for 2:00:00 on dbproxy1019.eqiad.wmnet with reason: host reimage	[production]
09:49	<btullis@cumin1001>	END (PASS) - Cookbook sre.zookeeper.roll-restart-zookeeper (exit_code=0) for Zookeeper A:zookeeper-druid-analytics cluster: Roll restart of jvm daemons.	[production]
09:48	<marostegui@cumin1001>	dbctl commit (dc=all): 'db1181 (re)pooling @ 25%: After restart', diff saved to https://phabricator.wikimedia.org/P32140 and previous config saved to /var/cache/conftool/dbconfig/20220802-094804-root.json	[production]
09:47	<marostegui@cumin1001>	dbctl commit (dc=all): 'db1174 (re)pooling @ 75%: After maintenance', diff saved to https://phabricator.wikimedia.org/P32139 and previous config saved to /var/cache/conftool/dbconfig/20220802-094759-root.json	[production]
09:44	<godog>	grow sdb3 by 100G on thanos-be2004 - T314275	[production]
09:43	<btullis@cumin1001>	START - Cookbook sre.zookeeper.roll-restart-zookeeper for Zookeeper A:zookeeper-druid-analytics cluster: Roll restart of jvm daemons.	[production]
09:42	<btullis@cumin1001>	END (PASS) - Cookbook sre.zookeeper.roll-restart-zookeeper (exit_code=0) for Zookeeper A:zookeeper-druid-public cluster: Roll restart of jvm daemons.	[production]
09:37	<btullis@cumin1001>	START - Cookbook sre.hosts.reimage for host dbproxy1019.eqiad.wmnet with OS bullseye	[production]
09:36	<btullis@cumin1001>	START - Cookbook sre.zookeeper.roll-restart-zookeeper for Zookeeper A:zookeeper-druid-public cluster: Roll restart of jvm daemons.	[production]
09:33	<marostegui@cumin1001>	dbctl commit (dc=all): 'db1181 (re)pooling @ 10%: After restart', diff saved to https://phabricator.wikimedia.org/P32138 and previous config saved to /var/cache/conftool/dbconfig/20220802-093259-root.json	[production]
09:32	<marostegui@cumin1001>	dbctl commit (dc=all): 'db1174 (re)pooling @ 50%: After maintenance', diff saved to https://phabricator.wikimedia.org/P32137 and previous config saved to /var/cache/conftool/dbconfig/20220802-093254-root.json	[production]