| 2021-07-28 |
  | 14:58 | <dzahn@cumin1001> | END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1434.eqiad.wmnet with reason: REIMAGE | [production] | 
            
  | 14:56 | <dzahn@cumin1001> | START - Cookbook sre.hosts.downtime for 2:00:00 on mw1434.eqiad.wmnet with reason: REIMAGE | [production] | 
            
  | 14:39 | <elukey@deploy1002> | helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'. | [production] | 
            
  | 14:33 | <dzahn@cumin1001> | END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on mw1434.eqiad.wmnet with reason: known issue | [production] | 
            
  | 14:33 | <dzahn@cumin1001> | START - Cookbook sre.hosts.downtime for 4:00:00 on mw1434.eqiad.wmnet with reason: known issue | [production] | 
            
  | 14:19 | <elukey@deploy1002> | helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'. | [production] | 
            
  | 14:06 | <dzahn@cumin1001> | END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1436.eqiad.wmnet with reason: REIMAGE | [production] | 
            
  | 14:06 | <elukey@deploy1002> | helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'. | [production] | 
            
  | 14:06 | <elukey@deploy1002> | helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'. | [production] | 
            
  | 14:06 | <dcausse@deploy1002> | helmfile [staging] Ran 'sync' command on namespace 'rdf-streaming-updater' for release 'main' . | [production] | 
            
  | 14:04 | <dzahn@cumin1001> | END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1435.eqiad.wmnet with reason: REIMAGE | [production] | 
            
  | 14:03 | <dzahn@cumin1001> | START - Cookbook sre.hosts.downtime for 2:00:00 on mw1436.eqiad.wmnet with reason: REIMAGE | [production] | 
            
  | 14:01 | <dzahn@cumin1001> | START - Cookbook sre.hosts.downtime for 2:00:00 on mw1435.eqiad.wmnet with reason: REIMAGE | [production] | 
            
  | 13:32 | <dzahn@cumin1001> | conftool action : set/pooled=inactive; selector: name=mw143[4-6].eqiad.wmnet | [production] | 
            
  | 13:29 | <moritzm> | installing python2.7 security updates on stretch | [production] | 
            
  | 13:08 | <moritzm> | installing python3.5 security updates on stretch | [production] | 
            
  | 12:27 | <dcausse@deploy1002> | helmfile [staging] Ran 'sync' command on namespace 'rdf-streaming-updater' for release 'main' . | [production] | 
            
  | 11:26 | <moritzm> | installing nginx security updates on thumbor* | [production] | 
            
  | 11:18 | <moritzm> | installing nginx security updates on sodium (mirrors.wikimedia.org) | [production] | 
            
  | 11:03 | <dzahn@cumin1001> | END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5 days, 8:00:00 on planet1002.eqiad.wmnet with reason: known issue | [production] | 
            
  | 11:03 | <dzahn@cumin1001> | START - Cookbook sre.hosts.downtime for 5 days, 8:00:00 on planet1002.eqiad.wmnet with reason: known issue | [production] | 
            
  | 10:11 | <moritzm> | installing remaining nginx security updates on stretch | [production] | 
            
  | 10:09 | <godog> | temp fix prometheus-icinga-am on alert1001 | [production] | 
            
  | 09:40 | <dcausse@deploy1002> | helmfile [staging] Ran 'sync' command on namespace 'rdf-streaming-updater' for release 'main' . | [production] | 
            
  | 09:40 | <urbanecm> | Start server-side upload for 1 video file (T287482) | [production] | 
            
  | 09:29 | <elukey@deploy1002> | helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'. | [production] | 
            
  | 09:29 | <elukey@deploy1002> | helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'. | [production] | 
            
  | 09:28 | <elukey@deploy1002> | helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'. | [production] | 
            
  | 09:24 | <elukey@deploy1002> | helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'. | [production] | 
            
  | 09:24 | <elukey@deploy1002> | helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'. | [production] | 
            
  | 08:33 | <marostegui@cumin1001> | END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1122.eqiad.wmnet with reason: REIMAGE | [production] | 
            
  | 08:31 | <marostegui@cumin1001> | START - Cookbook sre.hosts.downtime for 2:00:00 on db1122.eqiad.wmnet with reason: REIMAGE | [production] | 
            
  | 08:27 | <Amir1> | running several long-running queries against pc1007 | [production] | 
            
  | 08:13 | <oblivian@deploy1002> | helmfile [staging] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . | [production] | 
            
  | 08:01 | <dcausse@deploy1002> | helmfile [staging] Ran 'sync' command on namespace 'rdf-streaming-updater' for release 'main' . | [production] | 
            
  | 07:53 | <moritzm> | installing aspell security updates on stretch | [production] | 
            
  | 07:20 | <dcaro@cumin1001> | END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on 29 hosts with reason: T287559 | [production] | 
            
  | 07:20 | <dcaro@cumin1001> | START - Cookbook sre.hosts.downtime for 5:00:00 on 29 hosts with reason: T287559 | [production] | 
            
  | 07:20 | <dcaro@cumin1001> | END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on 40 hosts with reason: T287559 | [production] | 
            
  | 07:20 | <dcaro@cumin1001> | START - Cookbook sre.hosts.downtime for 5:00:00 on 40 hosts with reason: T287559 | [production] | 
            
  | 07:20 | <dcaro@cumin1001> | END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on 6 hosts with reason: T287559 | [production] | 
            
  | 07:20 | <dcaro@cumin1001> | START - Cookbook sre.hosts.downtime for 5:00:00 on 6 hosts with reason: T287559 | [production] | 
            
  | 07:07 | <godog> | remove cloud*/syslog.log from centrallog2001 - T287559 | [production] | 
            
  | 07:06 | <godog> | remove node_pinger.prom from node-pinger hosts | [production] | 
            
  | 06:42 | <godog> | remove obsolete user.log.manual-rotation from centrallog1001 to free disk space | [production] | 
            
  | 02:43 | <TimStarling> | on mwmaint2002 fixing T286273 broken files using eval.php | [production] |