2019-07-23
22:36 <mutante> rolling out scap 3.11.1-1 on mw-eqiad servers [production]
22:14 <mutante> continuing rollout of new scap version 3.11.1-1, starting with kafka-all followed by other cumin-alias groups (T228328) [production]
22:06 <herron> puppet temporarily disabled on eqiad/codfw logstash collectors while catching up with backlog. see /etc/logstash/conf.d/01-filter_temp_drops.conf [production]
21:52 <herron> logstash - temporarily dropping logs matching [message] =~ /^SlowTimer/ due to UTF-8 parsing errors that are stopping the logstash processing pipeline. will re-enable after logstash has caught up with the backlog [production]
20:59 <shdubsh> temporarily disable input-kafka-rsyslog-shipper and drop memcached logs on logstash nodes [production]
20:08 <paravoid> asw2-a-eqiad: request virtual-chassis vc-port set interface member 6 vcp-255/1/0 disable [production]
19:58 <eileen> process-control config revision is 4006d3bdc5 - disabled drush fill donor totals job [production]
19:49 <mutante> mwdebug1002 - restarting hhvm - mw1312 - restarted apache [production]
19:44 <andrewbogott> restarting rabbitmq-server on cloudcontrol1003 and 1004 [production]
19:40 <mutante> restarting hhvm on mw1312 [production]
19:28 <cdanis> depool all appservers in eqiad A7 cdanis@cumin1001.eqiad.wmnet ~ 🍵 sudo cumin 'mw12[67-83]*' 'depool' [production]
19:11 <bblack> repool lvs1013 - T227143 [production]
19:10 <bblack> repool cp1077 + cp1078 - T227143 [production]
19:09 <elukey> depool mw1261 for investigation [production]
19:06 <herron> restarting logstash on logstash100[789] [production]
18:53 <robh> mw1271 had power loss event due to pdu swap via T227143 [production]
18:45 <mutante> rolling out scap 3.11.1-1 on all mw codfw servers (T228328) [production]
18:43 <mutante> rolling out scap 3.11.1-1 on mw canary servers (T228328) [production]
18:13 <robh> started depooling servers in a7-eqiad for pdu work via T227143 [production]
18:11 <cdanis> depool mw1267 [production]
18:10 <cdanis> cdanis@mw1267.eqiad.wmnet /srv/mediawiki ☕ scap pull [production]
18:09 <cdanis> cdanis@mw1267.eqiad.wmnet ~ ☕ sudo apt install python-concurrent.futures [production]
18:08 <jforrester@deploy1001> Synchronized php-1.34.0-wmf.15/includes/export/XmlDumpWriter.php: T228720 Make XmlDumpwriter resilient to blob store corruption (duration: 00m 54s) [production]
18:07 <James_F> Belay that, error on mw1267. [production]
18:06 <James_F> Sync error on mw1314.eqiad.wmnet, No module named concurrent.futures [production]
18:05 <jforrester@deploy1001> Synchronized php-1.34.0-wmf.14/includes/export/XmlDumpWriter.php: T228720 Make XmlDumpwriter resilient to blob store corruption (duration: 00m 57s) [production]
18:05 <bblack> lvs1013 - disable puppet and stop pybal - T227143 [production]
18:04 <bblack> depool cp1077 + cp1088 - T227143 [production]
18:03 <cdanis@deploy1001> Synchronized docroot/noc/db.php: 8def4af1d noc db.php: include readonly status & group loads (duration: 00m 55s) [production]
17:52 <moritzm> installing Java security updates on kafka/main and Logstash servers [production]
17:38 <ppchelko@deploy1001> Finished deploy [changeprop/deploy@6c5c0a3]: Switch internal events to the new schema T226522, step 2 (duration: 01m 37s) [production]
17:36 <ppchelko@deploy1001> Started deploy [changeprop/deploy@6c5c0a3]: Switch internal events to the new schema T226522, step 2 [production]
17:00 <ppchelko@deploy1001> Finished deploy [changeprop/deploy@894f735]: Switch internal events to the new schema T226522 (duration: 01m 30s) [production]
16:58 <ppchelko@deploy1001> Started deploy [changeprop/deploy@894f735]: Switch internal events to the new schema T226522 [production]
16:22 <godog> pool prometheus1003 - T227139 [production]
15:46 <robh> side b of a5-eqiad swapping pdu via T227141 [production]
15:14 <otto@> helmfile [CODFW] Ran 'apply' command on namespace 'eventgate-main' for release 'main' . [production]
15:08 <_joe_> uninstalling php-pear, php-mail, php-mail-mime from mw1267 T195364 [production]
14:52 <ppchelko@deploy1001> Finished deploy [restbase/deploy@ea10fa5]: Switch event production to eventgate T211248, attempt 2 (duration: 13m 08s) [production]
14:39 <ppchelko@deploy1001> Started deploy [restbase/deploy@ea10fa5]: Switch event production to eventgate T211248, attempt 2 [production]
14:14 <robh> a3-eqiad pdu swap taking place now via T227139 [production]
13:47 <otto@> helmfile [EQIAD] Ran 'apply' command on namespace 'eventgate-main' for release 'main' . [production]
13:45 <godog> depool restbase1016 restbase1019 restbase1011 restbase1010 prometheus1003 ahead of PDU work - T227139 [production]
13:45 <otto@> helmfile [CODFW] Ran 'apply' command on namespace 'eventgate-main' for release 'main' . [production]
13:44 <moritzm> installing Java security updates on furud/flerovium [production]
13:43 <otto@> helmfile [STAGING] Ran 'apply' command on namespace 'eventgate-main' for release 'main' . [production]
13:27 <jeh> dumps switching active vps to labstore1006 T224228 [production]
13:17 <liw@deploy1001> rebuilt and synchronized wikiversions files: group0 to 1.34.0-wmf.15 [production]
13:07 <jmm@cumin2001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
13:07 <jmm@cumin2001> START - Cookbook sre.hosts.downtime [production]