2019-06-06
10:30 <ema> varnish 5.1.3-1wm10 uploaded to stretch-wikimedia T224694 [production]
10:19 <elukey> rolling restart of mcrouter on mw1* hosts to pick up config change (batch of 5 hosts, depool/run-puppet/pool) [production]
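The rolling restart above (batches of 5 hosts, depool/run-puppet/pool) follows a common pattern. A minimal sketch of the batching logic, where the `depool`, `run_puppet`, and `pool` helpers are hypothetical stand-ins for the real conftool/puppet tooling:

```python
# Sketch of a batched rolling restart: depool, run puppet, repool.
# The helper functions are hypothetical placeholders, not the actual
# production commands.

def chunk(hosts, size=5):
    """Split a host list into batches of at most `size` hosts."""
    return [hosts[i:i + size] for i in range(0, len(hosts), size)]

def depool(host):
    print(f"depool {host}")

def run_puppet(host):
    # Running puppet picks up the mcrouter config change and restarts it.
    print(f"run-puppet-agent on {host}")

def pool(host):
    print(f"pool {host}")

def rolling_restart(hosts, batch_size=5):
    """Restart hosts in batches so most of the fleet stays pooled."""
    for batch in chunk(hosts, batch_size):
        for host in batch:
            depool(host)
            run_puppet(host)
            pool(host)
```

Batching keeps capacity loss bounded: at most `batch_size` hosts are out of rotation at any time.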
10:12 <elukey> disable puppet on mw1* and mw[2163,2235,2255,2271] as prep step for mcrouter config deploy [production]
10:11 <Lucas_WMDE> wikidata-new-wbterm update core to dfe30d5118 [wikidata-dev]
10:10 <fsero> rolled back last deployment of mathoid to revision 16 [production]
10:08 <Lucas_WMDE> wikidata-new-wbterm sudo apt install zip unzip # needed for composer update [wikidata-dev]
09:59 <mobrovac@deploy1001> scap-helm mathoid finished [production]
09:59 <mobrovac@deploy1001> scap-helm mathoid cluster codfw completed [production]
09:59 <mobrovac@deploy1001> scap-helm mathoid cluster eqiad completed [production]
09:59 <mobrovac@deploy1001> scap-helm mathoid upgrade production stable/mathoid -f mathoid-values.yaml [namespace: mathoid, clusters: eqiad,codfw] [production]
09:52 <elukey> chown report updater output dirs on stat1007 to analytics:wikidev (was hdfs:wikidev) to unblock creation of new data [analytics]
09:45 <elukey> re-run refine_sanitize_eventlogging_analytics_immediate with since = 900 in the .properties file [analytics]
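Re-running the job required overriding `since` in the job's `.properties` file. A minimal sketch of that kind of override for a Java-style properties file; the helper name and surrounding keys are hypothetical, only the `since` key comes from the log:

```python
# Minimal sketch: set a key in Java-style .properties text, replacing an
# existing assignment or appending one, as when setting `since = 900`
# before re-running the refine job. Helper name is hypothetical.

def set_property(text, key, value):
    """Return `text` with `key = value` set (replace or append)."""
    lines, replaced = [], False
    for line in text.splitlines():
        name = line.split("=", 1)[0].strip()
        if name == key and not line.lstrip().startswith("#"):
            lines.append(f"{key} = {value}")
            replaced = True
        else:
            lines.append(line)
    if not replaced:
        lines.append(f"{key} = {value}")
    return "\n".join(lines)
```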
09:31 <moritzm> rebooting mwdebug2002 for some tests [production]
09:31 <jmm@cumin2001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
09:30 <jmm@cumin2001> START - Cookbook sre.hosts.downtime [production]
09:28 <moritzm> updating qemu on ganeti2004 for some tests [production]
09:24 <gehel@cumin2001> START - Cookbook sre.postgresql.postgres-init [production]
08:38 <marostegui> Stop MySQL on db1117:3322 - this will trigger haproxy alerts - T222682 [production]
08:13 <hashar> Reloading Zuul for I764972711843645afd00e196a3bedd17730b4cbe which drops mwselenium-quibble-docker from Wikibase [releng]
07:35 <marostegui@deploy1001> Synchronized wmf-config/db-eqiad.php: Repool db1121 after upgrade T224852 (duration: 00m 53s) [production]
07:20 <marostegui> Stop MySQL on db1121 for upgrade, this will generate lag on labs hosts for s6 - T224852 [production]
07:16 <marostegui@deploy1001> Synchronized wmf-config/db-codfw.php: Promote db2046 to s6 master as db2039 will be decommissioned T221533 (duration: 00m 55s) [production]
06:38 <elukey> re-run refine_sanitize_eventlogging_analytics_immediate with since = 48 in the .properties file (manually added) [analytics]
06:31 <marostegui> Start topology changes on s6 codfw to promote db2046 as master - T221533 [production]
06:23 <marostegui@deploy1001> Synchronized wmf-config/db-eqiad.php: Depool db1121 for upgrade T224852 (duration: 00m 55s) [production]
06:15 <marostegui@deploy1001> Synchronized wmf-config/db-eqiad.php: Fully repool db1091 after getting its BBU replaced (duration: 00m 54s) [production]
06:01 <marostegui@deploy1001> Synchronized wmf-config/db-eqiad.php: More traffic to db1091 after getting its BBU replaced (duration: 01m 01s) [production]
05:47 <marostegui@deploy1001> Synchronized wmf-config/db-eqiad.php: More traffic to db1091 after getting its BBU replaced (duration: 00m 55s) [production]
05:41 <marostegui> Upgrade MySQL on s6 codfw hosts in preparation for s6 codfw master failover - T221533 [production]
05:36 <elukey> chown analytics:analytics /wmf/data/event_sanitized/{CentralNoticeTiming,LayoutJank,EventTiming,ElementTiming} (new directories created with yarn:analytics) [analytics]
05:32 <marostegui@deploy1001> Synchronized wmf-config/db-eqiad.php: More traffic to db1091 after getting its BBU replaced (duration: 00m 55s) [production]
05:18 <marostegui> Remove db2042 from tendril and zarcillo T225090 [production]
05:18 <marostegui> Remove db2042 from tendril and zarcillo [production]
05:14 <marostegui> Stop MySQL on db2042 to copy its content to dbprov2001 as a temporary backup - T225090 [production]
05:11 <marostegui> Disable notifications db2042 - T225090 [production]
05:09 <marostegui@deploy1001> Synchronized wmf-config/db-eqiad.php: Slowly repool db1091 after getting its BBU replaced T225060 (duration: 00m 56s) [production]
2019-06-05
22:50 <Krenair> Added cloudcontrol1004 IP to match cloudcontrol1003 rule in 'proxy' security group rules for port 5668 T225168 [project-proxy]
22:44 <Krenair> Updating 'proxy' security group rules for port 5668 to remove decommissioned IP - 208.80.154.147 californium T189921 [project-proxy]
22:36 <Krenair> Updating 'proxy' security group rules for port 5668 to remove decommissioned IPs - 208.80.154.136 silver, 208.80.155.117 labs-ns0, 208.80.152.32 virt0 (!), 208.80.153.48 labtestservices2001, 208.80.154.92 labcontrol1001 [project-proxy]
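The cleanup above removes rules whose source IPs belong to decommissioned hosts. A sketch of that filtering step, using a deliberately simplified rule structure (the real OpenStack API objects are richer); the IP set is taken from the log entry:

```python
# Sketch of the security-group cleanup: drop port-5668 rules whose
# source IP belongs to a decommissioned host. The rule dicts are a
# hypothetical simplification of what the OpenStack API returns.

DECOMMISSIONED = {
    "208.80.154.136",  # silver
    "208.80.155.117",  # labs-ns0
    "208.80.152.32",   # virt0
    "208.80.153.48",   # labtestservices2001
    "208.80.154.92",   # labcontrol1001
}

def prune_rules(rules, dead_ips=DECOMMISSIONED, port=5668):
    """Keep only rules that are not for `port` from a decommissioned IP."""
    return [
        r for r in rules
        if not (r["port"] == port and r["remote_ip"] in dead_ips)
    ]
```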
22:15 <chaomodus> restarting gerrit on cobalt due to it being down (seems like Java out of heap space) [production]
22:14 <Krenair> Per jeh's investigation, added cloudservices1004 IP to match cloudservices1003 rule in 'proxy' security group rules for port 5668 [project-proxy]
20:59 <mforns> finished deployment of analytics/refinery up to 0660e70153dec892ae20bee7119a72cc17e8ec87 [analytics]
20:43 <mforns@deploy1001> Finished deploy [analytics/refinery@0660e70]: deploying analytics/refinery up to 0660e70153dec892ae20bee7119a72cc17e8ec87 (duration: 19m 30s) [production]
20:39 <reedy@deploy1001> Synchronized wmf-config/flaggedrevs.php: Turn off some FR config T225138 (duration: 00m 54s) [production]
20:25 <akosiaris@deploy1001> scap-helm blubberoid finished [production]
20:25 <akosiaris@deploy1001> scap-helm blubberoid cluster codfw completed [production]
20:25 <akosiaris@deploy1001> scap-helm blubberoid cluster eqiad completed [production]
20:25 <akosiaris@deploy1001> scap-helm blubberoid upgrade -f blubberoid-values.yaml production stable/blubberoid [namespace: blubberoid, clusters: eqiad,codfw] [production]
20:23 <mforns@deploy1001> Started deploy [analytics/refinery@0660e70]: deploying analytics/refinery up to 0660e70153dec892ae20bee7119a72cc17e8ec87 [production]
20:20 <mforns> starting deployment of analytics/refinery up to 0660e70153dec892ae20bee7119a72cc17e8ec87 [analytics]