| 
      
        2020-05-28
      
     | 
  
    
  | 14:01 | 
  <ema> | 
  atskafka 0.8 uploaded to buster-wikimedia T253551 | 
  [production] | 
            
  | 13:49 | 
  <godog> | 
  roll-restart prometheus k8s-staging to enable thanos upload - T252186 | 
  [production] | 
            
  | 13:36 | 
  <hashar> | 
  Restarting CI Jenkins for plugin rollback | 
  [production] | 
            
  | 11:49 | 
  <moritzm> | 
  installing unbound security updates | 
  [production] | 
            
  | 11:03 | 
  <kormat@cumin1001> | 
  dbctl commit (dc=all): 'Add db2138 to s2+s4 T252985', diff saved to https://phabricator.wikimedia.org/P11330 and previous config saved to /var/cache/conftool/dbconfig/20200528-110333-kormat.json | 
  [production] | 
            
  | 10:36 | 
  <jayme@deploy1001> | 
  helmfile [EQIAD] Ran 'sync' command on namespace 'blubberoid' for release 'production' . | 
  [production] | 
            
  | 10:34 | 
  <jayme@deploy1001> | 
  helmfile [CODFW] Ran 'sync' command on namespace 'blubberoid' for release 'production' . | 
  [production] | 
            
  | 10:30 | 
  <jayme@deploy1001> | 
  helmfile [STAGING] Ran 'sync' command on namespace 'blubberoid' for release 'staging' . | 
  [production] | 
            
  | 10:02 | 
  <mutante> | 
  gerrit1002 (test server) - chown -R gerrit2:gerrit2 /var/lib/gerrit/review_site ; restarted gerrit service, now the service is not in restart loop anymore, gerrit-ssh is listening too, just not accepting publickey (T239151) | 
  [production] | 
            
  | 09:51 | 
  <XioNoX> | 
  failover VRRP in ulsfo | 
  [production] | 
            
  | 09:41 | 
  <XioNoX> | 
  re-activate peering/transit on cr2-eqdfw - T243080 | 
  [production] | 
            
  | 09:35 | 
  <mutante> | 
  restarting gerrit on gerrit1002 after fixing db_pass to the readonly one (T243800) | 
  [production] | 
            
  | 09:33 | 
  <XioNoX> | 
  restart cr2-eqdfw for upgrade - T243080 | 
  [production] | 
            
  | 09:30 | 
  <XioNoX> | 
  deactivate peering/transit on cr2-eqdfw - T243080 | 
  [production] | 
            
  | 09:25 | 
  <_joe_> | 
  updating ACLs on all etcd servers | 
  [production] | 
            
  | 09:22 | 
  <XioNoX> | 
  install new Junos on cr2-eqdfw - T243080 | 
  [production] | 
            
  | 09:16 | 
  <XioNoX> | 
  rollback cr2-eqord ospf/bgp - T243080 | 
  [production] | 
            
  | 09:07 | 
  <XioNoX> | 
  restart cr2-eqord for upgrade - T243080 | 
  [production] | 
            
  | 09:05 | 
  <jayme@deploy1001> | 
  helmfile [STAGING] Ran 'sync' command on namespace 'blubberoid' for release 'staging' . | 
  [production] | 
            
  | 08:50 | 
  <_joe_> | 
  upgrading etcd ACLs (adding new users) to conf1004 | 
  [production] | 
            
  | 08:50 | 
  <XioNoX> | 
  install new Junos on cr2-eqord - T243080 | 
  [production] | 
            
  | 08:46 | 
  <XioNoX> | 
  deactivate peering/transit on cr2-eqord - T243080 | 
  [production] | 
            
  | 08:45 | 
  <XioNoX> | 
  de-pref all OSPF links to cr2-eqord - T243080 | 
  [production] | 
            
  | 08:13 | 
  <marostegui> | 
  Pool db1141 into labsdb analytics role - T249188 | 
  [production] | 
            
  | 07:33 | 
  <gilles@deploy1001> | 
  Synchronized static/images: T252108 Deploying optimised static PNGs (duration: 01m 39s) | 
  [production] | 
            
  | 07:31 | 
  <gilles@deploy1001> | 
  Synchronized static/apple-touch: T252108 Deploying optimised static PNGs (duration: 01m 12s) | 
  [production] | 
            
  | 06:30 | 
  <marostegui@cumin1001> | 
  dbctl commit (dc=all): 'Remove db1081 from API and set its weight to 0 on main traffic - preparation for tomorrow's failover T253808', diff saved to https://phabricator.wikimedia.org/P11329 and previous config saved to /var/cache/conftool/dbconfig/20200528-063037-marostegui.json | 
  [production] | 
            
  | 04:44 | 
  <marostegui> | 
  Run check_private data on db1141 - T249188 | 
  [production] | 
            
  | 04:22 | 
  <marostegui> | 
  Stop MySQL on db1141 - T249188 | 
  [production] | 
            
  
    | 
      
        2020-05-27
      
     | 
  
    
  | 23:20 | 
  <catrope@deploy1001> | 
  Synchronized wmf-config/InitialiseSettings.php: Add autoreviewrestore right to rollbacker group on hiwiki (T252986) (duration: 01m 05s) | 
  [production] | 
            
  | 23:16 | 
  <catrope@deploy1001> | 
  Synchronized wmf-config/InitialiseSettings.php: Add thwiki Draft namespace to wmgExemptFromUserRobotsControlExtra and enable VE there (T252959) (duration: 01m 06s) | 
  [production] | 
            
  | 22:58 | 
  <gehel@cumin1001> | 
  END (PASS) - Cookbook sre.postgresql.postgres-init (exit_code=0) | 
  [production] | 
            
  | 22:02 | 
  <crusnov@deploy1001> | 
  Finished deploy [netbox/deploy@5251cf1]: Netbox Upgrade to 2.8.4 (part4) (duration: 00m 10s) | 
  [production] | 
            
  | 22:02 | 
  <crusnov@deploy1001> | 
  Started deploy [netbox/deploy@5251cf1]: Netbox Upgrade to 2.8.4 (part4) | 
  [production] | 
            
  | 22:01 | 
  <crusnov@deploy1001> | 
  Finished deploy [netbox/deploy@5251cf1]: Netbox Upgrade to 2.8.4 (part3) (duration: 01m 29s) | 
  [production] | 
            
  | 22:00 | 
  <crusnov@deploy1001> | 
  Started deploy [netbox/deploy@5251cf1]: Netbox Upgrade to 2.8.4 (part3) | 
  [production] | 
            
  | 22:00 | 
  <crusnov@deploy1001> | 
  deploy aborted: Netbox Upgrade to 2.8.4 (part2) (duration: 01m 31s) | 
  [production] | 
            
  | 21:58 | 
  <crusnov@deploy1001> | 
  Started deploy [netbox/deploy@5251cf1]: Netbox Upgrade to 2.8.4 (part2) | 
  [production] | 
            
  | 21:58 | 
  <crusnov@deploy1001> | 
  Finished deploy [netbox/deploy@5251cf1]: Netbox Upgrade to 2.8.1 (part1) (duration: 01m 01s) | 
  [production] | 
            
  | 21:57 | 
  <crusnov@deploy1001> | 
  Started deploy [netbox/deploy@5251cf1]: Netbox Upgrade to 2.8.1 (part1) | 
  [production] | 
            
  | 20:43 | 
  <gehel@cumin1001> | 
  START - Cookbook sre.postgresql.postgres-init | 
  [production] | 
            
  | 20:28 | 
  <marostegui> | 
  Decrease innodb poolsize on s4 master and restart mysql | 
  [production] | 
            
  | 20:11 | 
  <mbsantos@deploy1001> | 
  Finished deploy [mobileapps/deploy@9dc827f]: Update mobileapps to b3b9214c (T253648) (duration: 03m 31s) | 
  [production] | 
            
  | 20:08 | 
  <mbsantos@deploy1001> | 
  Started deploy [mobileapps/deploy@9dc827f]: Update mobileapps to b3b9214c (T253648) | 
  [production] | 
            
  | 20:04 | 
  <twentyafterfour@deploy1001> | 
  Synchronized php: group1 wikis to 1.35.0-wmf.32 refs T253022 (duration: 01m 04s) | 
  [production] | 
            
  | 20:03 | 
  <twentyafterfour@deploy1001> | 
  rebuilt and synchronized wikiversions files: group1 wikis to 1.35.0-wmf.32 refs T253022 | 
  [production] | 
            
  | 20:00 | 
  <gehel@cumin1001> | 
  END (PASS) - Cookbook sre.postgresql.postgres-init (exit_code=0) | 
  [production] | 
            
  | 19:56 | 
  <twentyafterfour@deploy1001> | 
  scap failed: average error rate on 4/9 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/e474f13ffac6b8c3bf919c4aeafc8c9b for details) | 
  [production] | 
            
  | 19:46 | 
  <jforrester@deploy1001> | 
  Synchronized php-1.35.0-wmf.34/includes/parser/CoreParserFunctions.php: T253725 Partially revert 'Fix impedance mismatch with Parser::getRevisionRecordObject()' (duration: 01m 05s) | 
  [production] | 
            
  | 19:12 | 
  <joal@deploy1001> | 
  Finished deploy [analytics/refinery@8a3dcb3]: Analytics regular weekly train (an-launcher1001 only) [8a3dcb3] (duration: 06m 07s) | 
  [production] |