2020-11-16
11:13 <moritzm> installing poppler security updates [production]
10:46 <klausman@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
10:46 <klausman@cumin1001> START - Cookbook sre.hosts.downtime [production]
10:45 <dcaro@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
10:45 <dcaro@cumin1001> START - Cookbook sre.hosts.downtime [production]
10:44 <dcaro@cumin1001> END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) [production]
10:44 <dcaro@cumin1001> START - Cookbook sre.hosts.downtime [production]
09:31 <gehel@cumin2001> END (FAIL) - Cookbook sre.elasticsearch.force-shard-allocation (exit_code=99) [production]
09:31 <gehel@cumin2001> START - Cookbook sre.elasticsearch.force-shard-allocation [production]
08:39 <godog> centrallog1001 move invalid config /etc/logrotate.d/logrotate-debug to /etc [production]
08:35 <moritzm> installing codemirror-js security updates [production]
08:32 <XioNoX> asw-c-codfw> request system power-off member 7 - T267865 [production]
08:24 <joal@deploy1001> Finished deploy [analytics/refinery@3df51cb] (thin): Analytics special train for webrequest table update THIN [analytics/refinery@3df51cb] (duration: 00m 07s) [production]
08:23 <joal@deploy1001> Started deploy [analytics/refinery@3df51cb] (thin): Analytics special train for webrequest table update THIN [analytics/refinery@3df51cb] [production]
08:23 <joal@deploy1001> Finished deploy [analytics/refinery@3df51cb]: Analytics special train for webrequest table update [analytics/refinery@3df51cb] (duration: 10m 09s) [production]
08:13 <joal@deploy1001> Started deploy [analytics/refinery@3df51cb]: Analytics special train for webrequest table update [analytics/refinery@3df51cb] [production]
08:08 <XioNoX> asw-c-codfw> request system power-off member 7 - T267865 [production]
06:35 <marostegui> Stop replication on s3 codfw master (db2105) for MCR schema change deployment T238966 [production]
06:14 <marostegui> Stop MySQL on es1018, es1015, es1019 to clone es1032, es1033, es1034 - T261717 [production]
06:06 <marostegui@cumin1001> dbctl commit (dc=all): 'Depool es1018, es1015, es1019 - T261717', diff saved to https://phabricator.wikimedia.org/P13262 and previous config saved to /var/cache/conftool/dbconfig/20201116-060624-marostegui.json [production]
06:02 <marostegui> Restart mysql on db1115 (tendril/dbtree) due to memory usage [production]
00:55 <shdubsh> re-applied mask to kafka and kafka-mirror-main-eqiad_to_main-codfw@0 on kafka-main2003 and disabled puppet to prevent restart - T267865 [production]
00:19 <elukey> run 'systemctl mask kafka' and 'systemctl mask kafka-mirror-main-eqiad_to_main-codfw@0' on kafka-main2003 (for the brief moment when it was up) to avoid purged issues - T267865 [production]
00:09 <elukey> sudo cumin 'cp2028* or cp2036* or cp2039* or cp4022* or cp4025* or cp4028* or cp4031*' 'systemctl restart purged' -b 3 - T267865 [production]
2020-11-15
22:10 <cdanis> restart some purgeds in ulsfo as well T267865 T267867 [production]
22:03 <cdanis> T267867 T267865 ✔️ cdanis@cumin1001.eqiad.wmnet ~ 🕔🍺 sudo cumin -b2 -s10 'A:cp and A:codfw' 'systemctl restart purged' [production]
14:00 <cdanis> powercycling ms-be1022 via mgmt [production]
11:21 <aborrero@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
11:21 <aborrero@cumin1001> START - Cookbook sre.hosts.downtime [production]
11:12 <vgutierrez> depooling lvs2007, lvs2010 taking over text traffic on codfw - T267865 [production]
10:00 <elukey> cumin 'cp2042* or cp2036* or cp2039*' 'systemctl restart purged' -b 1 [production]
09:57 <elukey> restart purged on cp4028 (consumer stuck due to kafka-main2003 down) [production]
09:55 <elukey> restart purged on cp4025 (consumer stuck due to kafka-main2003 down) [production]
09:53 <elukey> restart purged on cp4031 (consumer stuck due to kafka-main2003 down) [production]
09:50 <elukey> restart purged on cp4022 (consumer stuck due to kafka-main2003 down) [production]
09:42 <elukey> restart purged on cp2028 (kafka-main2003 is down and there are connect timeout errors) [production]
09:07 <Urbanecm> Change email for SUL user Botopol via resetUserEmail.php (T267866) [production]
08:27 <elukey> truncate -s 10g /var/lib/hadoop/data/n/yarn/logs/application_1601916545561_173219/container_e25_1601916545561_173219_01_000177/stderr on an-worker1100 [production]
08:24 <elukey> sudo truncate -s 10g /var/lib/hadoop/data/c/yarn/logs/application_1601916545561_173219/container_e25_1601916545561_173219_01_000019/stderr on an-worker1098 [production]
2020-11-13
22:06 <Urbanecm> [urbanecm@mwmaint1002 ~]$ mwscript emptyUserGroup.php --wiki=myvwiki autopatrolled # T105570 [production]
22:04 <Urbanecm> [urbanecm@mwmaint1002 ~]$ mwscript emptyUserGroup.php --wiki=testwiki editor # T105570 [production]
21:42 <Urbanecm> [urbanecm@mwmaint1002 ~]$ mwscript emptyUserGroup.php --wiki=enwikinews reviewer # T105570 [production]
21:40 <Urbanecm> [urbanecm@mwmaint1002 ~]$ mwscript emptyUserGroup.php --wiki=bnwiki editor # T105570 [production]
21:39 <Urbanecm> [urbanecm@mwmaint1002 ~]$ mwscript emptyUserGroup.php --wiki=testwiki flood # T105570 [production]
21:38 <Urbanecm> [urbanecm@mwmaint1002 ~]$ mwscript emptyUserGroup.php --wiki=test2wiki upwizcampeditors # T105570 [production]
21:33 <Urbanecm> [urbanecm@mwmaint1002 ~]$ mwscript emptyUserGroup.php --wiki=aawiki communityapplica # T105570 [production]
21:28 <Urbanecm> [urbanecm@mwmaint1002 ~]$ mwscript emptyUserGroup.php --wiki=enwiki epadmin # T105570 [production]
16:50 <_joe_> manually rotate user.log on centrallog1001 and moved it to /srv/user.log.manual-rotation [production]
15:31 <ejegg|away> updated fundraising CiviCRM from f7954c6659 to 74d795408f [production]
08:15 <vgutierrez> restart acme-chief on acmechief1001 [production]