| 2020-02-05
      
    
  | 13:46 | <marostegui> | Decrease buffer pool size on db1107 for testing - T242702 | [production] | 
            
  | 13:45 | <vgutierrez@cumin1001> | END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) | [production] | 
            
  | 13:43 | <vgutierrez@cumin1001> | START - Cookbook sre.hosts.downtime | [production] | 
            
  | 13:42 | <akosiaris> | undo the manually set 10.2.1.42 eventgate-analytics.discovery.wmnet in /etc/hosts for mw1331, mw1348. Verify hypothesis that this should cause increased latency. Restart php-fpm | [production] | 
            
  | 13:41 | <ema> | cp1075: unset Accept-Encoding on origin server requests T242478 | [production] | 
            
  | 13:39 | <Amir1> | EU SWAT is done | [production] | 
            
  | 13:38 | <ema> | cp: disable puppet and merge https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/570311/ T242478 | [production] | 
            
  | 13:35 | <XioNoX> | rollback traffic steering off cr2-eqord | [production] | 
            
  | 13:29 | <akosiaris> | manually set 10.2.1.42 eventgate-analytics.discovery.wmnet in /etc/hosts for mw1331, mw1348. Verify hypothesis that this should cause increased latency | [production] | 
            
  | 13:25 | <XioNoX> | reboot cr2-eqord for software upgrade - yaaaaa | [production] | 
            
  | 13:24 | <ladsgroup@deploy1001> | Synchronized php-1.35.0-wmf.18/extensions/Wikibase/lib/includes/Store/CachingPropertyInfoLookup.php: SWAT: [[gerrit:570301|Cache PropertyInfoLookup internally]] (T243955) (duration: 01m 07s) | [production] | 
            
  | 13:17 | <XioNoX> | increase ospf cost for cr2-eqord links | [production] | 
            
  | 13:16 | <vgutierrez> | upload acme-chief 0.23 to apt.wm.o (buster) - T244236 | [production] | 
            
  | 13:15 | <XioNoX> | disable transit/peering BGP sessions on cr2-eqord | [production] | 
            
  | 13:15 | <ladsgroup@deploy1001> | Synchronized php-1.35.0-wmf.16/extensions/Wikibase/lib/includes/Store/CachingPropertyInfoLookup.php: SWAT: [[gerrit:570301|Cache PropertyInfoLookup internally]] (T243955) (duration: 01m 07s) | [production] | 
            
  | 13:10 | <XioNoX> | rollback: disable transit/peering BGP sessions on cr2-eqdfw | [production] | 
            
  | 13:08 | <vgutierrez> | depooling & reimaging cp5006 as buster - T242093 | [production] | 
            
  | 13:03 | <urbanecm@deploy1001> | Synchronized wmf-config/InitialiseSettings.php: SWAT: 5cc2b70: wgLogoHD and $wgVectorPrintLogo is replaced with wgLogos (T232140) (duration: 01m 06s) | [production] | 
            
  | 13:01 | <XioNoX> | reboot cr2-eqdfw for software upgrade | [production] | 
            
  | 13:00 | <Amir1> | SWAT needs more time | [production] | 
            
  | 12:55 | <XioNoX> | disable transit/peering BGP sessions on cr2-eqdfw | [production] | 
            
  | 12:50 | <urbanecm@deploy1001> | Synchronized wmf-config/CommonSettings.php: SWAT: d450288: wgLogoHD and $wgVectorPrintLogo is replaced with wgLogos (T232140) (duration: 01m 07s) | [production] | 
            
  | 12:48 | <urbanecm@deploy1001> | Synchronized wmf-config/InitialiseSettings.php: SWAT: 5cc2b70: wgLogoHD and $wgVectorPrintLogo is replaced with wgLogos (T232140) (duration: 01m 07s) | [production] | 
            
  | 12:32 | <awight@deploy1001> | Synchronized php-1.35.0-wmf.18/extensions/Cite: SWAT: [[gerrit:570285|Revert follow standardization (T240858)]] (duration: 01m 13s) | [production] | 
            
  | 10:53 | <akosiaris> | rolling restart of all pods on kubernetes staging cluster to make sure everything is fine after the upgrade | [production] | 
            
  | 10:50 | <akosiaris> | T244335 upgrade kubernetes-node on kubestage1002.eqiad.wmnet to 1.13.12 | [production] | 
            
  | 10:43 | <ema> | cp4028: varnish-frontend-restart T243634 | [production] | 
            
  | 10:24 | <akosiaris> | T244335 upgrade kubernetes-master on neon.eqiad.wmnet (staging) | [production] | 
            
  | 10:24 | <effie> | Upload php-apcu_5.1.17+4.0.11-1+0~20190217111312.9+stretch~1.gbp192528+wmf2 - T236800 | [production] | 
            
  | 10:10 | <Urbanecm> | Run mwscript deleteEqualMessages.php --delete to delete GrowthExperiments' message overrides (cswiki, viwiki, arwiki, kowiki) | [production] | 
            
  | 09:57 | <akosiaris> | upload kubernetes 1.13.12 to apt.wikimedia.org stretch-wikimedia/main T244335 | [production] | 
            
  | 09:51 | <effie> | install libmemcached-tools on mc-gp* servers - T240684 | [production] | 
            
  | 09:05 | <ema> | add individual FortiGate IPs hitting ulsfo (currently cp4028) to vcl blocked_nets -- trying to identify problematic traffic T243634 | [production] | 
            
  | 07:02 | <marostegui> | Replay s1 traffic on db1107 (10.4) T242702 | [production] | 
            
  | 06:32 | <elukey> | force a puppet run on ores* hosts | [production] | 
            
  | 06:12 | <marostegui> | Remove partitions from revision table db1098:3317 - T239453 | [production] | 
            
  | 06:09 | <marostegui@cumin1001> | dbctl commit (dc=all): 'Depool db1098:3317 - T239453', diff saved to https://phabricator.wikimedia.org/P10312 and previous config saved to /var/cache/conftool/dbconfig/20200205-060942-marostegui.json | [production] | 
            
  | 06:09 | <marostegui@cumin1001> | dbctl commit (dc=all): 'Repool db2085:3311, db2086:3317 - T239453', diff saved to https://phabricator.wikimedia.org/P10311 and previous config saved to /var/cache/conftool/dbconfig/20200205-060911-marostegui.json | [production] | 
            
  | 02:38 | <cdanis> | T243634 ✔️ cdanis@cp4030.ulsfo.wmnet ~ 🕤🍺 sudo varnish-frontend-restart | [production] | 
            
  
    | 2020-02-04
      
    
  | 22:35 | <twentyafterfour@deploy1001> | rebuilt and synchronized wikiversions files: group0 wikis to 1.35.0-wmf.18  refs T233866 | [production] | 
            
  | 22:13 | <twentyafterfour@deploy1001> | Finished scap: testwikis wikis to 1.35.0-wmf.18  refs T233866 (duration: 32m 03s) | [production] | 
            
  | 22:03 | <cdanis@cumin2001> | conftool action : set/pooled=true; selector: dnsdisc=kartotherian,name=eqiad | [production] | 
            
  | 21:41 | <twentyafterfour@deploy1001> | Started scap: testwikis wikis to 1.35.0-wmf.18  refs T233866 | [production] | 
            
  | 21:29 | <twentyafterfour> | preparing the new mediawiki branch for deployment to test wikis | [production] | 
            
  | 20:31 | <shdubsh> | restart kartotherian on maps2001 | [production] | 
            
  | 20:24 | <shdubsh> | temporarily enable access logs on maps2001 | [production] | 
            
  | 20:20 | <twentyafterfour> | branching mediawiki to wmf/1.35.0-wmf.18 from commit 054dd94e97d6 - train blockers should be added as subtasks under T233866 | [production] | 
            
  | 20:06 | <marxarelli> | temporarily holding 1.35.0-wmf.18 [T233866] branch cut and train due to concurrent maps prod issues | [production] | 
            
  | 19:15 | <mutante> | cp3065 - powercycling | [production] | 
            
  | 18:45 | <cdanis@cumin2001> | conftool action : set/pooled=false; selector: dnsdisc=kartotherian,name=eqiad | [production] |