| 
      
        2021-07-13
      
     | 
  
    
  | 15:19 | 
  <otto@deploy1002> | 
  helmfile [staging] Ran 'sync' command on namespace 'eventgate-analytics' for release 'canary'. | 
  [production] | 
            
  | 14:52 | 
  <volker-e@deploy1002> | 
  Finished deploy [design/style-guide@5c07233]: Deploy design/style-guide: 5c07233 “Components”: Add WikimediaUI theme Figma links to various components (#483) (duration: 00m 06s) | 
  [production] | 
            
  | 14:52 | 
  <volker-e@deploy1002> | 
  Started deploy [design/style-guide@5c07233]: Deploy design/style-guide: 5c07233 “Components”: Add WikimediaUI theme Figma links to various components (#483) | 
  [production] | 
            
  | 14:35 | 
  <nskaggs@cumin1001> | 
  END (FAIL) - Cookbook wmcs.wikireplicas.add_wiki (exit_code=99) | 
  [production] | 
            
  | 14:35 | 
  <nskaggs@cumin1001> | 
  START - Cookbook wmcs.wikireplicas.add_wiki | 
  [production] | 
            
  | 13:57 | 
  <otto@deploy1002> | 
  Finished deploy [analytics/refinery@a3bc8bc]: Add eventlogging_legacy gobblin job  - T271232 (duration: 03m 28s) | 
  [production] | 
            
  | 13:53 | 
  <otto@deploy1002> | 
  Started deploy [analytics/refinery@a3bc8bc]: Add eventlogging_legacy gobblin job  - T271232 | 
  [production] | 
            
  | 13:37 | 
  <effie> | 
  rolling restart php-fpm across clusters - T286260 | 
  [production] | 
            
  | 13:33 | 
  <ladsgroup@deploy1002> | 
  Synchronized php-1.37.0-wmf.12/extensions/Wikibase/lib/includes/SimpleCacheWithBagOStuff.php: Backport: [[gerrit:704176|Send TTL instead of expiry in unix timestamp in calling BagOStuff (T286260)]] (duration: 00m 58s) | 
  [production] | 
            
  | 13:30 | 
  <jmm@cumin2002> | 
  END (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Muehlenhoff out of all services on: 2 hosts | 
  [production] | 
            
  | 13:29 | 
  <jmm@cumin2002> | 
  START - Cookbook sre.idm.logout Logging Muehlenhoff out of all services on: 2 hosts | 
  [production] | 
            
  | 13:14 | 
  <kormat> | 
  restarted replication on db1117:3325 T284622 | 
  [production] | 
            
  | 13:11 | 
  <jmm@cumin2002> | 
  END (FAIL) - Cookbook sre.idm.logout (exit_code=99) Logging Muehlenhoff out of all services on: 1732 hosts | 
  [production] | 
            
  | 13:10 | 
  <jmm@cumin2002> | 
  START - Cookbook sre.idm.logout Logging Muehlenhoff out of all services on: 1732 hosts | 
  [production] | 
            
  | 13:10 | 
  <hashar> | 
  Upgraded Apache on gerrit1001 and gerrit2001 | 
  [production] | 
            
  | 13:09 | 
  <jmm@cumin2002> | 
  END (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Muehlenhoff out of all services on: 1732 hosts | 
  [production] | 
            
  | 13:08 | 
  <jmm@cumin2002> | 
  START - Cookbook sre.idm.logout Logging Muehlenhoff out of all services on: 1732 hosts | 
  [production] | 
            
  | 12:55 | 
  <jmm@cumin2002> | 
  END (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Muehlenhoff out of all services on: 1732 hosts | 
  [production] | 
            
  | 12:53 | 
  <kormat> | 
  stopping replication on db1117:3325 T284622 | 
  [production] | 
            
  | 12:53 | 
  <kormat@cumin1001> | 
  END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1117.eqiad.wmnet with reason: Copy m5 from db1117 to db1183 T284622 | 
  [production] | 
            
  | 12:53 | 
  <kormat@cumin1001> | 
  START - Cookbook sre.hosts.downtime for 4:00:00 on db1117.eqiad.wmnet with reason: Copy m5 from db1117 to db1183 T284622 | 
  [production] | 
            
  | 12:43 | 
  <jmm@cumin2002> | 
  START - Cookbook sre.idm.logout Logging Muehlenhoff out of all services on: 1732 hosts | 
  [production] | 
            
  | 12:41 | 
  <mutante> | 
  depooling and decom'ing eqiad API servers mw1281, mw1282, mw1283 - T280203 | 
  [production] | 
            
  | 12:40 | 
  <dzahn@cumin1001> | 
  conftool action : set/pooled=no; selector: name=mw128[1-3].eqiad.wmnet | 
  [production] | 
            
  | 12:20 | 
  <mutante> | 
  mwmaint1002 - scap pull after reimaging | 
  [production] | 
            
  | 11:33 | 
  <dzahn@cumin1001> | 
  END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mwmaint1002.eqiad.wmnet with reason: REIMAGE | 
  [production] | 
            
  | 11:31 | 
  <dzahn@cumin1001> | 
  START - Cookbook sre.hosts.downtime for 2:00:00 on mwmaint1002.eqiad.wmnet with reason: REIMAGE | 
  [production] | 
            
  | 11:28 | 
  <Lucas_WMDE> | 
  EU backport+config window done | 
  [production] | 
            
  | 11:25 | 
  <lucaswerkmeister-wmde@deploy1002> | 
  Synchronized wmf-config/CommonSettings.php: Config: [[gerrit:704304|Remove obsolete $wgShowDBErrorBacktrace config]] (duration: 01m 25s) | 
  [production] | 
            
  | 11:13 | 
  <mutante> | 
  mwmaint1002 - reimaging with buster (T267607) | 
  [production] | 
            
  | 10:54 | 
  <mutante> | 
  switching https://noc.wikimedia.org backend from eqiad to codfw for mwmaint1002 OS upgrade, not affecting config-master/pybal, tests passed (T267607) | 
  [production] | 
            
  | 10:44 | 
  <moritzm> | 
  upgrading apache on phab1001 (phabricator.wikimedia.org) | 
  [production] | 
            
  | 10:39 | 
  <hnowlan@cumin1001> | 
  END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on maps2008.codfw.wmnet with reason: reimaging as buster replica | 
  [production] | 
            
  | 10:39 | 
  <hnowlan@cumin1001> | 
  START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on maps2008.codfw.wmnet with reason: reimaging as buster replica | 
  [production] | 
            
  | 10:39 | 
  <hnowlan> | 
  running `nodetool decommission` on maps2008 | 
  [production] | 
            
  | 10:27 | 
  <moritzm> | 
  installing apache security updates on alert1001 (icinga.wikimedia.org) | 
  [production] | 
            
  | 10:21 | 
  <kormat@cumin1001> | 
  END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on 18 hosts with reason: Deploying schema change to s1 T277116 | 
  [production] | 
            
  | 10:21 | 
  <kormat@cumin1001> | 
  START - Cookbook sre.hosts.downtime for 4:00:00 on 18 hosts with reason: Deploying schema change to s1 T277116 | 
  [production] | 
            
  | 10:18 | 
  <moritzm> | 
  installing apache security updates on Logstash hosts | 
  [production] | 
            
  | 09:58 | 
  <moritzm> | 
  upgrading PHP/Apache on matomo1002 (piwik.wikimedia.org) | 
  [production] | 
            
  | 09:40 | 
  <moritzm> | 
  installing apache security updates on thanos-fe hosts | 
  [production] | 
            
  | 09:38 | 
  <moritzm> | 
  installing apache security updates on parsoid hosts | 
  [production] | 
            
  | 09:31 | 
  <effie> | 
  depool mw2383 T286463 | 
  [production] | 
            
  | 09:18 | 
  <volans@cumin2002> | 
  END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host sretest1001.eqiad.wmnet | 
  [production] | 
            
  | 09:15 | 
  <volans@cumin2002> | 
  START - Cookbook sre.hosts.reboot-single for host sretest1001.eqiad.wmnet | 
  [production] | 
            
  | 09:00 | 
  <kormat@cumin1001> | 
  END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on 12 hosts with reason: Deploying schema change to s3 T277116 | 
  [production] | 
            
  | 09:00 | 
  <kormat@cumin1001> | 
  START - Cookbook sre.hosts.downtime for 4:00:00 on 12 hosts with reason: Deploying schema change to s3 T277116 | 
  [production] | 
            
  | 08:59 | 
  <volans@cumin2002> | 
  END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on sretest1001.eqiad.wmnet with reason: testing the cookbook | 
  [production] | 
            
  | 08:59 | 
  <volans@cumin2002> | 
  START - Cookbook sre.hosts.downtime for 0:10:00 on sretest1001.eqiad.wmnet with reason: testing the cookbook | 
  [production] | 
            
  | 08:45 | 
  <effie> | 
  depool mw2383 - T286463 | 
  [production] |