| 2021-07-27 |
    
  | 10:39 | <dzahn@cumin1001> | END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mw1269.eqiad.wmnet | [production] | 
            
  | 10:16 | <jelto> | gitlab-ansible playbook on gitlab2001.wikimedia.org END (PASS) | [production] | 
            
  | 10:11 | <mutante> | replacing scap proxies: mw1269 with mw1420, mw1285 with mw1306 | [production] | 
            
  | 10:10 | <marostegui@cumin1001> | dbctl commit (dc=all): 'db2147 (re)pooling @ 100%: After mariadb restart and upgrade', diff saved to https://phabricator.wikimedia.org/P16909 and previous config saved to /var/cache/conftool/dbconfig/20210727-101053-root.json | [production] | 
            
  | 10:10 | <dzahn@cumin1001> | START - Cookbook sre.hosts.decommission for hosts mw1269.eqiad.wmnet | [production] | 
            
  | 10:06 | <jelto> | running gitlab-ansible playbook on gitlab2001.wikimedia.org | [production] | 
            
  | 10:05 | <dzahn@cumin1001> | conftool action : set/pooled=inactive; selector: name=mw1269.eqiad.wmnet | [production] | 
            
  | 09:55 | <marostegui@cumin1001> | dbctl commit (dc=all): 'db2147 (re)pooling @ 75%: After mariadb restart and upgrade', diff saved to https://phabricator.wikimedia.org/P16908 and previous config saved to /var/cache/conftool/dbconfig/20210727-095549-root.json | [production] | 
            
  | 09:52 | <jynus> | reverting query killer parameters on s3 codfw replicas | [production] | 
            
  | 09:41 | <dzahn@cumin1001> | END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mw1285.eqiad.wmnet | [production] | 
            
  | 09:40 | <marostegui@cumin1001> | dbctl commit (dc=all): 'db2147 (re)pooling @ 50%: After mariadb restart and upgrade', diff saved to https://phabricator.wikimedia.org/P16906 and previous config saved to /var/cache/conftool/dbconfig/20210727-094046-root.json | [production] | 
            
  | 09:25 | <marostegui@cumin1001> | dbctl commit (dc=all): 'db2147 (re)pooling @ 25%: After mariadb restart and upgrade', diff saved to https://phabricator.wikimedia.org/P16905 and previous config saved to /var/cache/conftool/dbconfig/20210727-092542-root.json | [production] | 
            
  | 09:13 | <dzahn@cumin1001> | START - Cookbook sre.hosts.decommission for hosts mw1285.eqiad.wmnet | [production] | 
            
  | 09:12 | <dzahn@cumin1001> | conftool action : set/pooled=inactive; selector: name=mw1285.eqiad.wmnet | [production] | 
            
  | 09:10 | <marostegui@cumin1001> | dbctl commit (dc=all): 'db2147 (re)pooling @ 15%: After mariadb restart and upgrade', diff saved to https://phabricator.wikimedia.org/P16904 and previous config saved to /var/cache/conftool/dbconfig/20210727-091038-root.json | [production] | 
            
  | 09:04 | <_joe_> | restarting pybal on lvs2009 to pick up the new api depool threshold | [production] | 
            
  | 08:57 | <_joe_> | repooling mw225[12] for apis | [production] | 
            
  | 08:56 | <_joe_> | restart pybal on lvs2010 to pick up the depool threshold change | [production] | 
            
  | 08:55 | <marostegui@cumin1001> | dbctl commit (dc=all): 'db2147 (re)pooling @ 10%: After mariadb restart and upgrade', diff saved to https://phabricator.wikimedia.org/P16902 and previous config saved to /var/cache/conftool/dbconfig/20210727-085535-root.json | [production] | 
            
  | 08:40 | <marostegui@cumin1001> | dbctl commit (dc=all): 'db2147 (re)pooling @ 5%: After mariadb restart and upgrade', diff saved to https://phabricator.wikimedia.org/P16901 and previous config saved to /var/cache/conftool/dbconfig/20210727-084031-root.json | [production] | 
            
  | 08:36 | <jynus> | reenabled puppet on mwmaint1002 | [production] | 
            
  | 08:29 | <volans@deploy1002> | Finished deploy [netbox/deploy@660ad14]: Test v2.10.4-wmf5 on netbox-next (duration: 01m 01s) | [production] | 
            
  | 08:28 | <volans@deploy1002> | Started deploy [netbox/deploy@660ad14]: Test v2.10.4-wmf5 on netbox-next | [production] | 
            
  | 08:28 | <marostegui@cumin1001> | dbctl commit (dc=all): 'Depool db2147 to restart mysql', diff saved to https://phabricator.wikimedia.org/P16900 and previous config saved to /var/cache/conftool/dbconfig/20210727-082820-marostegui.json | [production] | 
            
  | 07:52 | <jynus> | disabling puppet on mwmaint1002 | [production] | 
            
  | 07:14 | <moritzm> | installing krb security updates on buster | [production] | 
            
  | 06:50 | <elukey> | install iptables from buster-backports (manually) on ml-serve-ctrl200[1,2] as test (+ reboot the nodes for a clean start) - T287238 | [production] | 
            
  | 06:20 | <ladsgroup@deploy1002> | Synchronized wmf-config/Wikibase.php: Config: [[gerrit:708204|Enable request language for RDF stubs in testwikidatawiki (T285795)]], Part II (duration: 00m 56s) | [production] | 
            
  | 06:18 | <ladsgroup@deploy1002> | Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:708204|Enable request language for RDF stubs in testwikidatawiki (T285795)]], Part I (duration: 00m 57s) | [production] | 
            
  | 05:34 | <marostegui@cumin1001> | END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1162.eqiad.wmnet with reason: REIMAGE | [production] | 
            
  | 05:32 | <marostegui@cumin1001> | START - Cookbook sre.hosts.downtime for 2:00:00 on db1162.eqiad.wmnet with reason: REIMAGE | [production] | 
            
  | 05:12 | <marostegui@cumin1001> | dbctl commit (dc=all): 'Depool db1162 T287230', diff saved to https://phabricator.wikimedia.org/P16899 and previous config saved to /var/cache/conftool/dbconfig/20210727-051212-marostegui.json | [production] | 
            
  
    | 2021-07-26 |
    
  | 23:37 | <legoktm@deploy1002> | Synchronized php-1.37.0-wmf.15/extensions/Score/includes/Score.php: Increase lilypond version cache TTL to 1 hour (duration: 00m 57s) | [production] | 
            
  | 18:30 | <cstone> | SmashPig revision changed from be272c02ce to 020d4eccd4, | [production] | 
            
  | 17:41 | <legoktm> | ran `scap pull` and repooled mw2336.codfw.wmnet - T287394 | [production] | 
            
  | 17:41 | <legoktm@cumin1001> | conftool action : set/pooled=yes; selector: name=mw2336.codfw.wmnet | [production] | 
            
  | 17:40 | <jynus@cumin1001> | END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dbprov1002.eqiad.wmnet with reason: REIMAGE | [production] | 
            
  | 17:38 | <jynus@cumin1001> | START - Cookbook sre.hosts.downtime for 2:00:00 on dbprov1002.eqiad.wmnet with reason: REIMAGE | [production] | 
            
  | 16:06 | <legoktm> | depooled mw2336.codfw.wmnet, mgmt is down too. T287394 | [production] | 
            
  | 16:04 | <legoktm@cumin1001> | conftool action : set/pooled=no; selector: name=mw2336.codfw.wmnet | [production] | 
            
  | 15:29 | <hashar> | Restarted gerrit replica on gerrit2001.wikimedia.org # T287122 | [production] | 
            
  | 15:24 | <ladsgroup@deploy1002> | Synchronized php-1.37.0-wmf.15/extensions/AbuseFilter/includes/AbuseFilterHooks.php: Backport: [[gerrit:707021|Don’t generate current content text twice]], Part II (duration: 01m 49s) | [production] | 
            
  | 15:21 | <ladsgroup@deploy1002> | Synchronized php-1.37.0-wmf.15/extensions/AbuseFilter/includes/VariableGenerator/RunVariableGenerator.php: Backport: [[gerrit:707021|Don’t generate current content text twice]], Part I (duration: 01m 50s) | [production] | 
            
  | 15:19 | <topranks> | Adding peering to AS139931 - Bangladesh Submarine Cable Company - at Equinix Singapore on cr3-eqsin | [production] | 
            
  | 14:42 | <dcausse@deploy1002> | helmfile [staging] Ran 'sync' command on namespace 'rdf-streaming-updater' for release 'main' . | [production] | 
            
  | 13:42 | <oblivian@deploy1002> | helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . | [production] | 
            
  | 10:55 | <ladsgroup@deploy1002> | Synchronized wmf-config/InitialiseSettings.php: Disable DPL on ruwikinews (duration: 00m 27s) | [production] | 
            
  | 10:53 | <ladsgroup@deploy1002> | Scap failed!: 3/6 canaries failed their endpoint checks(https://en.wikipedia.org) | [production] | 
            
  | 10:52 | <ladsgroup@deploy1002> | Scap failed!: 2/6 canaries failed their endpoint checks(https://en.wikipedia.org) | [production] | 
            
  | 10:51 | <jynus> | deploying 10 second mw user query limit on s3 codfw replicas | [production] |