| 2022-03-01 | 
    
  | 10:24 | <cmooney@cumin1001> | START - Cookbook sre.dns.netbox | [production] | 
            
  | 10:05 | <vgutierrez> | pool cp2039 running HAProxy as TLS termination layer - T290005 T271421 | [production] | 
            
  | 09:48 | <elukey> | elukey@stat1004:~$ sudo kill `pgrep -u zpapierski` (offboarded user, puppet broken on the host) | [production] | 
            
  | 09:45 | <vgutierrez@cumin1001> | END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp2039.codfw.wmnet with OS buster | [production] | 
            
  | 09:33 | <_joe_> | restarted pybal on lvs1019, removed the mw api from ipvsadm, the mw api is internally fully encrypted | [production] | 
            
  | 09:31 | <_joe_> | restart pybal on lvs1020 | [production] | 
            
  | 09:25 | <jmm@cumin2002> | END (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Amuigai out of all services on: 1881 hosts | [production] | 
            
  | 09:25 | <elukey> | restart varnishkafka-webrequest on cp6009 as an attempt to clear a weird status of librdkafka (delivery errors to kafka) | [production] | 
            
  | 09:25 | <_joe_> | manually removed ipvs entries on lvs2*, so it is actually now that the http api is not available in codfw anymore | [production] | 
            
  | 09:24 | <jmm@cumin2002> | START - Cookbook sre.idm.logout Logging Amuigai out of all services on: 1881 hosts | [production] | 
            
  | 09:24 | <jmm@cumin2002> | END (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging ZPapierski out of all services on: 1881 hosts | [production] | 
            
  | 09:22 | <jmm@cumin2002> | START - Cookbook sre.idm.logout Logging ZPapierski out of all services on: 1881 hosts | [production] | 
            
  | 09:22 | <_joe_> | restarted pybal on lvs2009, the mw api is now effectively https-only in codfw T287820 | [production] | 
            
  | 09:20 | <_joe_> | restarted pybal on lvs2010 | [production] | 
            
  | 09:14 | <vgutierrez@cumin1001> | END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp2039.codfw.wmnet with reason: host reimage | [production] | 
            
  | 09:12 | <vgutierrez@cumin1001> | START - Cookbook sre.hosts.downtime for 2:00:00 on cp2039.codfw.wmnet with reason: host reimage | [production] | 
            
  | 09:06 | <elukey> | restart purged on cp6005 | [production] | 
            
  | 08:57 | <elukey> | restart purged on cp6004 | [production] | 
            
  | 08:54 | <vgutierrez@cumin1001> | START - Cookbook sre.hosts.reimage for host cp2039.codfw.wmnet with OS buster | [production] | 
            
  | 08:27 | <urbanecm> | UTC morning B&C window done | [production] | 
            
  | 08:25 | <elukey> | restart purged on cp6003 | [production] | 
            
  | 08:16 | <moritzm> | drain instances off ganeti2008 for eventual decom | [production] | 
            
  | 08:08 | <urbanecm@deploy1002> | Synchronized wmf-config/ProductionServices.php: d149208dfd7e5fbf51f44dd0bf7dae3b2e2f5159: Use service-proxy to connect to linkrecommendation (T302719) (duration: 00m 49s) | [production] | 
            
  | 07:59 | <elukey> | restart purged on cp6002 | [production] | 
            
  | 06:58 | <oblivian@deploy1002> | Finished deploy [restbase/deploy@0848b15] (dev-cluster): T302464 test (duration: 00m 17s) | [production] | 
            
  | 06:57 | <oblivian@deploy1002> | Started deploy [restbase/deploy@0848b15] (dev-cluster): T302464 test | [production] | 
            
  | 06:56 | <elukey> | restart purged on cp6001 to clear stale kafka TLS consumer state (or attempting to) | [production] | 
            
  | 06:46 | <_joe_> | uploaded scap 4.4.1 to {stretch,buster,bullseye} T302464 | [production] | 
            
  | 06:46 | <_joe_> | uploaded scap 4.4.1 to {stretch,buster,bullseye} | [production] | 
            
  | 02:59 | <ladsgroup@cumin1001> | dbctl commit (dc=all): 'Repooling after maintenance db1104 (T302185)', diff saved to https://phabricator.wikimedia.org/P21618 and previous config saved to /var/cache/conftool/dbconfig/20220301-025938-ladsgroup.json | [production] | 
            
  | 02:44 | <ladsgroup@cumin1001> | dbctl commit (dc=all): 'Repooling after maintenance db1104', diff saved to https://phabricator.wikimedia.org/P21617 and previous config saved to /var/cache/conftool/dbconfig/20220301-024433-ladsgroup.json | [production] | 
            
  | 02:29 | <ladsgroup@cumin1001> | dbctl commit (dc=all): 'Repooling after maintenance db1104', diff saved to https://phabricator.wikimedia.org/P21616 and previous config saved to /var/cache/conftool/dbconfig/20220301-022928-ladsgroup.json | [production] | 
            
  | 02:14 | <ladsgroup@cumin1001> | dbctl commit (dc=all): 'Repooling after maintenance db1104 (T302185)', diff saved to https://phabricator.wikimedia.org/P21615 and previous config saved to /var/cache/conftool/dbconfig/20220301-021424-ladsgroup.json | [production] | 
            
  | 01:14 | <ladsgroup@cumin1001> | dbctl commit (dc=all): 'Depooling db1104 (T302185)', diff saved to https://phabricator.wikimedia.org/P21614 and previous config saved to /var/cache/conftool/dbconfig/20220301-011404-ladsgroup.json | [production] | 
            
  | 01:14 | <ladsgroup@cumin1001> | END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1104.eqiad.wmnet with reason: Maintenance | [production] | 
            
  | 01:13 | <ladsgroup@cumin1001> | START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1104.eqiad.wmnet with reason: Maintenance | [production] | 
            
  | 00:17 | <mutante> | 15.wikipedia.org on k8s (staging) [deploy1002:~] $ curl -s --resolve "15.wikipedia.org:4111:staging.svc.eqiad.wmnet" 'https://15.wikipedia.org' | grep grandpa   =>  "“Wikipedia is like an all-knowing grandpa.”" | T300171 | [production] | 
            
  
    | 2022-02-28 | 
    
  | 22:36 | <ebernhardson> | start in-place reindex of kmwiki, kmwiktionary and kmwikibooks on cirrus cloudelastic cluster T299707 | [production] | 
            
  | 22:00 | <tzatziki> | running extensions/SecurePoll/cli/wm-scripts/ucoc/populateEditCount.php on each wiki (s1 thru s8 simultaneously) (T302433) | [production] | 
            
  | 21:39 | <urbanecm> | UTC late B&C window done | [production] | 
            
  | 21:38 | <urbanecm@deploy1002> | Synchronized php-1.38.0-wmf.23/extensions/VisualEditor/modules/ve-mw/init/targets: e22e4d5: b4dd4c4: VisualEditor backports (T302746) (duration: 00m 51s) | [production] | 
            
  | 21:30 | <urbanecm@deploy1002> | Synchronized php-1.38.0-wmf.23/includes/htmlform/: 67831a3: Revert "htmlform: Replace some uses of isHidden to isDisabled" (T302512) (duration: 00m 48s) | [production] | 
            
  | 21:24 | <urbanecm@deploy1002> | Synchronized php-1.38.0-wmf.23/extensions/GrowthExperiments/includes/Specials/SpecialMentorDashboard.php: 706c2bc7f86f9eadc1284c84cc6668a4e1bf5abc: Mentor dashboard: Mark mentor-tools as stable (T280307) (duration: 00m 49s) | [production] | 
            
  | 20:45 | <kharlan@deploy1002> | helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply | [production] | 
            
  | 20:45 | <kharlan@deploy1002> | helmfile [staging] START helmfile.d/services/linkrecommendation: apply | [production] | 
            
  | 20:21 | <kharlan@deploy1002> | helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply | [production] | 
            
  | 20:20 | <kharlan@deploy1002> | helmfile [staging] START helmfile.d/services/linkrecommendation: apply | [production] | 
            
  | 20:03 | <tzatziki> | creating ucoc_edits table on each wiki for elections voterlist (T302433) | [production] | 
            
  | 19:51 | <razzi@cumin1001> | END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host datahubsearch1003.eqiad.wmnet | [production] | 
            
  | 19:50 | <rzl@deploy1002> | helmfile [eqiad] DONE helmfile.d/services/miscweb: apply | [production] |