| 2019-04-18 |
    
  | 21:18 | <thcipriani@deploy1001> | Started deploy [gerrit/gerrit@e3c340f]: plugin update -- no restart needed (cobalt) | [production] | 
            
  | 21:17 | <thcipriani@deploy1001> | Finished deploy [gerrit/gerrit@e3c340f]: plugin update -- no restart needed (gerrit2001) (duration: 00m 11s) | [production] | 
            
  | 21:17 | <thcipriani@deploy1001> | Started deploy [gerrit/gerrit@e3c340f]: plugin update -- no restart needed (gerrit2001) | [production] | 
            
  | 21:14 | <robh@cumin1001> | END (FAIL) - Cookbook sre.hosts.decommission (exit_code=99) | [production] | 
            
  | 21:14 | <robh@cumin1001> | START - Cookbook sre.hosts.decommission | [production] | 
            
  | 20:56 | <twentyafterfour@deploy1001> | rebuilt and synchronized wikiversions files: all wikis to 1.34.0-wmf.1  refs T220726 | [production] | 
            
  | 20:52 | <cdanis> | root@icinga1001.wikimedia.org /var/lib/icinga # for DOWNTIME in $(fgrep -B12 'comment=mobrovac: temp stop JQ for T221368 - cdanis@cumin1001' retention.dat | grep -A13 servicedowntime | grep downtime_id | cut -d= -f2); do  printf "[%lu] DEL_SVC_DOWNTIME;%u\n" $(date +%s) $DOWNTIME ; done > rw/icinga.cmd | [production] | 
            
  | 20:40 | <mobrovac@deploy1001> | Synchronized php-1.34.0-wmf.1/extensions/Translate/utils/MessageUpdateJob.php: Translate jobs: Remove problematic Job::$params assignments, dir 2/2 - T221368 (duration: 01m 00s) | [production] | 
            
  | 20:38 | <mobrovac@deploy1001> | Synchronized php-1.34.0-wmf.1/extensions/Translate/tag: Translate jobs: Remove problematic Job::$params assignments, dir 1/2 - T221368 (duration: 01m 01s) | [production] | 
            
  | 20:32 | <cdanis> | cdanis@cumin1001.eqiad.wmnet ~ % sudo cumin 'scb*' 'enable-puppet "mobrovac: temp stop JQ for T221368"' | [production] | 
            
  | 20:31 | <mobrovac@deploy1001> | Finished deploy [cpjobqueue/deploy@71941b1]: Ignore Kafka disconnect errors (duration: 00m 51s) | [production] | 
            
  | 20:30 | <mobrovac@deploy1001> | Started deploy [cpjobqueue/deploy@71941b1]: Ignore Kafka disconnect errors | [production] | 
            
  | 19:36 | <cdanis> | cdanis@cumin1001.eqiad.wmnet ~ % sudo cookbook sre.hosts.downtime -r "mobrovac: temp stop JQ for T221368" 'scb*' | [production] | 
            
  | 19:36 | <cdanis@cumin1001> | END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) | [production] | 
            
  | 19:36 | <cdanis@cumin1001> | START - Cookbook sre.hosts.downtime | [production] | 
            
  | 19:29 | <cdanis> | cdanis@cumin1001.eqiad.wmnet ~ % sudo cumin 'scb*' 'disable-puppet "mobrovac: temp stop JQ for T221368" && systemctl stop cpjobqueue' | [production] | 
            
  | 19:17 | <mobrovac@deploy1001> | Started restart [cpjobqueue/deploy@922cbc0]: Bounce CP4JQ, lots of transport broken failures - T221368 | [production] | 
            
  | 19:11 | <mobrovac@deploy1001> | Synchronized php-1.34.0-wmf.1/extensions/EventBus/includes/EventFactory.php: Remove the use of page titles in JobExecutor, file 2/2 - T221368 (duration: 00m 59s) | [production] | 
            
  | 19:10 | <mobrovac@deploy1001> | Synchronized php-1.34.0-wmf.1/extensions/EventBus/includes/JobExecutor.php: Remove the use of page titles in JobExecutor, file 1/2 - T221368 (duration: 01m 01s) | [production] | 
            
  | 18:47 | <robh@cumin1001> | END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) | [production] | 
            
  | 18:47 | <robh@cumin1001> | START - Cookbook sre.hosts.decommission | [production] | 
            
  | 18:47 | <robh@cumin1001> | END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) | [production] | 
            
  | 18:47 | <robh@cumin1001> | START - Cookbook sre.hosts.decommission | [production] | 
            
  | 18:41 | <mutante> | mw2150 - reimaging, not in confctl | [production] | 
            
  | 18:02 | <dzahn@puppetmaster1001> | conftool action : set/pooled=yes; selector: name=mw2151.codfw.wmnet,cluster=jobrunner,service=nginx | [production] | 
            
  | 17:49 | <mutante> | mw2151 - scap pull | [production] | 
            
  | 17:46 | <mobrovac@deploy1001> | Synchronized php-1.34.0-wmf.1/extensions/EventBus/includes/JobExecutor.php: Default to a dummy title for invalid titles - T221368 (duration: 01m 01s) | [production] | 
            
  | 17:20 | <twentyafterfour@deploy1001> | Synchronized php-1.34.0-wmf.1/extensions/AbuseFilter/includes/: sync https://gerrit.wikimedia.org/r/c/mediawiki/extensions/AbuseFilter/+/504863 (duration: 01m 00s) | [production] | 
            
  | 16:20 | <bblack> | Experimental DNS-level changes deploying for wikipedia.org domain - if wikipedia.org DNS problems appear, revert https://gerrit.wikimedia.org/r/c/operations/dns/+/504588 - T208263 | [production] | 
            
  | 16:17 | <XioNoX> | remove peering to 63199 in eqsin (down for 1 month, no reply to emails) | [production] | 
            
  | 16:13 | <XioNoX> | rollback dhcp option 82 test from asw2-b-eqiad | [production] | 
            
  | 14:55 | <fsero> | synchronizing docker_registry_codfw swift container from docker_registry | [production] | 
            
  | 14:40 | <XioNoX> | push firewall change to pfw3-eqiad - T221278 | [production] | 
            
  | 13:30 | <jbond42> | rolling updates of ruby2.1 on jessie | [production] | 
            
  | 13:08 | <elukey> | roll restart of cassandra on aqs* to pick up new openjdk upgrades | [production] | 
            
  | 13:05 | <jmm@cumin2001> | END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) | [production] | 
            
  | 13:05 | <jmm@cumin2001> | START - Cookbook sre.hosts.downtime | [production] | 
            
  | 12:58 | <reedy@deploy1001> | rebuilt and synchronized wikiversions files: group1 back to .25 | [production] | 
            
  | 12:36 | <anomie> | Ran `php7adm /opcache-free` on mw1274 to test a theory related to T221347. The log entries related to that task stopped immediately. | [production] | 
            
  | 12:30 | <gehel> | restarting blazegraph + updater on wdqs* for jvm upgrade | [production] | 
            
  | 12:22 | <moritzm> | installing Java security updates on restbase-dev hosts (along with Cassandra restarts) | [production] | 
            
  | 12:21 | <gehel> | restarting blazegraph + updater on wdqs1009 / wdqs1010 for jvm upgrade | [production] | 
            
  | 12:19 | <moritzm> | installing Java security updates on WDQS autodeploy/test hosts | [production] | 
            
  | 10:40 | <jmm@cumin2001> | END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) | [production] | 
            
  | 10:40 | <jmm@cumin2001> | START - Cookbook sre.hosts.downtime | [production] | 
            
  | 10:35 | <moritzm> | installing rails security updates on jessie hosts | [production] | 
            
  | 10:21 | <moritzm> | installing jasper updates on jessie hosts | [production] | 
            
  | 09:44 | <akosiaris> | update grafana service/ dashboard to have user, system, throttled CPU metrics under the CPU saturation row | [production] | 
            
  | 09:41 | <gilles@deploy1001> | Synchronized wmf-config/InitialiseSettings.php: T216597 Run CPU benchmark for all samples on eswiki/ruwiki (duration: 01m 06s) | [production] | 
            
  | 09:11 | <jmm@cumin2001> | END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) | [production] |