2019-04-18
21:14 <robh@cumin1001> START - Cookbook sre.hosts.decommission [production]
20:56 <twentyafterfour@deploy1001> rebuilt and synchronized wikiversions files: all wikis to 1.34.0-wmf.1 refs T220726 [production]
20:52 <cdanis> root@icinga1001.wikimedia.org /var/lib/icinga # for DOWNTIME in $(fgrep -B12 'comment=mobrovac: temp stop JQ for T221368 - cdanis@cumin1001' retention.dat | grep -A13 servicedowntime | grep downtime_id | cut -d= -f2); do printf "[%lu] DEL_SVC_DOWNTIME;%u\n" $(date +%s) $DOWNTIME ; done > rw/icinga.cmd [production]
20:40 <mobrovac@deploy1001> Synchronized php-1.34.0-wmf.1/extensions/Translate/utils/MessageUpdateJob.php: Translate jobs: Remove problematic Job::$params assignments, dir 2/2 - T221368 (duration: 01m 00s) [production]
20:38 <mobrovac@deploy1001> Synchronized php-1.34.0-wmf.1/extensions/Translate/tag: Translate jobs: Remove problematic Job::$params assignments, dir 1/2 - T221368 (duration: 01m 01s) [production]
20:32 <cdanis> cdanis@cumin1001.eqiad.wmnet ~ % sudo cumin 'scb*' 'enable-puppet "mobrovac: temp stop JQ for T221368"' [production]
20:31 <mobrovac@deploy1001> Finished deploy [cpjobqueue/deploy@71941b1]: Ignore Kafka disconnect errors (duration: 00m 51s) [production]
20:30 <mobrovac@deploy1001> Started deploy [cpjobqueue/deploy@71941b1]: Ignore Kafka disconnect errors [production]
19:36 <cdanis> cdanis@cumin1001.eqiad.wmnet ~ % sudo cookbook sre.hosts.downtime -r "mobrovac: temp stop JQ for T221368" 'scb*' [production]
19:36 <cdanis@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
19:36 <cdanis@cumin1001> START - Cookbook sre.hosts.downtime [production]
19:29 <cdanis> cdanis@cumin1001.eqiad.wmnet ~ % sudo cumin 'scb*' 'disable-puppet "mobrovac: temp stop JQ for T221368" && systemctl stop cpjobqueue' [production]
19:17 <mobrovac@deploy1001> Started restart [cpjobqueue/deploy@922cbc0]: Bounce CP4JQ, lots of transport broken failures - T221368 [production]
19:11 <mobrovac@deploy1001> Synchronized php-1.34.0-wmf.1/extensions/EventBus/includes/EventFactory.php: Remove the use of page titles in JobExecutor, file 2/2 - T221368 (duration: 00m 59s) [production]
19:10 <mobrovac@deploy1001> Synchronized php-1.34.0-wmf.1/extensions/EventBus/includes/JobExecutor.php: Remove the use of page titles in JobExecutor, file 1/2 - T221368 (duration: 01m 01s) [production]
18:47 <robh@cumin1001> END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) [production]
18:47 <robh@cumin1001> START - Cookbook sre.hosts.decommission [production]
18:47 <robh@cumin1001> END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) [production]
18:47 <robh@cumin1001> START - Cookbook sre.hosts.decommission [production]
18:41 <mutante> mw2150 - reimaging, not in confctl [production]
18:02 <dzahn@puppetmaster1001> conftool action : set/pooled=yes; selector: name=mw2151.codfw.wmnet,cluster=jobrunner,service=nginx [production]
17:49 <mutante> mw2151 - scap pull [production]
17:46 <mobrovac@deploy1001> Synchronized php-1.34.0-wmf.1/extensions/EventBus/includes/JobExecutor.php: Default to a dummy title for invalid titles - T221368 (duration: 01m 01s) [production]
17:20 <twentyafterfour@deploy1001> Synchronized php-1.34.0-wmf.1/extensions/AbuseFilter/includes/: sync https://gerrit.wikimedia.org/r/c/mediawiki/extensions/AbuseFilter/+/504863 (duration: 01m 00s) [production]
16:20 <bblack> Experimental DNS-level changes deploying for wikipedia.org domain - if wikipedia.org DNS problems appear, revert https://gerrit.wikimedia.org/r/c/operations/dns/+/504588 - T208263 [production]
16:17 <XioNoX> remove peering to 63199 in eqsin (down for 1 month, no reply to emails) [production]
16:13 <XioNoX> rollback dhcp option 82 test from asw2-b-eqiad [production]
14:55 <fsero> synchronizing docker_registry_codfw swift container from docker_registry [production]
14:40 <XioNoX> push firewall change to pfw3-eqiad - T221278 [production]
13:30 <jbond42> rolling updates of ruby2.1 on jessie [production]
13:08 <elukey> roll restart of cassandra on aqs* to pick up new openjdk upgrades [production]
13:05 <jmm@cumin2001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
13:05 <jmm@cumin2001> START - Cookbook sre.hosts.downtime [production]
12:58 <reedy@deploy1001> rebuilt and synchronized wikiversions files: group1 back to .25 [production]
12:36 <anomie> Ran `php7adm /opcache-free` on mw1274 to test a theory related to T221347. The log entries related to that task stopped immediately. [production]
12:30 <gehel> restarting blazegraph + updater on wdqs* for jvm upgrade [production]
12:22 <moritzm> installing Java security updates on restbase-dev hosts (along with Cassandra restarts) [production]
12:21 <gehel> restarting blazegraph + updater on wdqs1009 / wdqs1010 for jvm upgrade [production]
12:19 <moritzm> installing Java security updates on WDQS autodeploy/test hosts [production]
10:40 <jmm@cumin2001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
10:40 <jmm@cumin2001> START - Cookbook sre.hosts.downtime [production]
10:35 <moritzm> installing rails security updates on jessie hosts [production]
10:21 <moritzm> installing jasper updates on jessie hosts [production]
09:44 <akosiaris> update grafana service/ dashboard to have user, system, throttled CPU metrics under the CPU saturation row [production]
09:41 <gilles@deploy1001> Synchronized wmf-config/InitialiseSettings.php: T216597 Run CPU benchmark for all samples on eswiki/ruwiki (duration: 01m 06s) [production]
09:11 <jmm@cumin2001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
09:10 <jmm@cumin2001> START - Cookbook sre.hosts.downtime [production]
09:00 <jmm@cumin2001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
09:00 <jmm@cumin2001> START - Cookbook sre.hosts.downtime [production]
08:54 <elukey@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]