| 2020-10-29 |
  | 07:56 | <vgutierrez> | set LimitNOFILE=500000 for gdnsd on authdns1001 | [production] | 
            
  | 07:54 | <marostegui> | Disconnect replication codfw -> eqiad on s4 T266663 | [production] | 
            
  | 07:50 | <vgutierrez> | restart haproxy on authdns2001 | [production] | 
            
  | 07:49 | <marostegui> | Disconnect replication codfw -> eqiad on s8 T266663 | [production] | 
            
  | 07:48 | <godog> | swift codfw-prod: bump object weight for ms-be2057 - T261633 | [production] | 
            
  | 07:46 | <marostegui> | Disconnect replication codfw -> eqiad on s3 T266663 | [production] | 
            
  | 07:43 | <vgutierrez> | restart anycast-healthchecker on authdns2001 | [production] | 
            
  | 07:34 | <vgutierrez> | set LimitNOFILE=500000 for gdnsd on authdns2001 | [production] | 
            
  | 07:27 | <elukey> | "sudo truncate -s 10g /var/log/daemon.log" on authdns2001 | [production] | 
            
  | 06:52 | <marostegui> | Disconnect replication codfw -> eqiad on s2 T266663 | [production] | 
            
  | 06:38 | <marostegui> | Disconnect replication codfw -> eqiad on s7 T266663 | [production] | 
            
  | 06:36 | <marostegui> | Disconnect replication codfw -> eqiad on s6 T266663 | [production] | 
            
  | 06:25 | <elukey> | execute 'truncate -s 10g /var/log/syslog.1' on authdns2001 - root partition full | [production] | 
            
  | 06:23 | <marostegui> | Disconnect replication codfw -> eqiad on s5 T266663 | [production] | 
            
  | 06:10 | <marostegui> | Disconnect replication codfw -> eqiad on es4 and es5 T266663 | [production] | 
            
  | 06:07 | <marostegui> | Disconnect replication codfw -> eqiad on x1 T266663 | [production] | 
            
  | 05:58 | <marostegui> | Disconnect replication codfw -> eqiad on pc1, pc2 and pc3 T266663 | [production] | 
            
  | 04:06 | <ryankemper@cumin1001> | END (PASS) - Cookbook sre.elasticsearch.rolling-restart (exit_code=0) | [production] | 
            
  | 01:41 | <mutante> | scandium reimaged a second time after making puppet changes to ensure nodejs/npm is NOT installed anymore (T257906) | [production] | 
            
  | 01:17 | <ryankemper> | T266492 Beginning rolling restart of eqiad cirrus cluster, 3 nodes at a time, on `ryankemper@cumin1001` tmux session `elasticsearch_restart_eqiad` | [production] | 
            
  | 01:16 | <ryankemper@cumin1001> | START - Cookbook sre.elasticsearch.rolling-restart | [production] | 
            
  | 00:51 | <ryankemper> | Finished restart of wdqs categories across production hosts; wdqs deploy is complete and the service is healthy | [production] | 
            
  | 00:14 | <Amir1> | rolling restart of ores | [production] | 
            
  | 00:12 | <dzahn@cumin1001> | END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) | [production] | 
            
  | 00:10 | <dzahn@cumin1001> | START - Cookbook sre.hosts.downtime | [production] | 
            
  | 00:04 | <ryankemper> | Beginning restart of wdqs categories across production hosts, one at a time: `sudo -E cumin -b 1 'A:wdqs-all and not A:wdqs-test' 'depool && sleep 60 && systemctl restart wdqs-categories && sleep 30 && pool'` | [production] | 
            
  | 00:03 | <ryankemper> | Restarted wdqs categories across test hosts: `sudo -E cumin 'A:wdqs-test' 'systemctl restart wdqs-categories'` | [production] | 
            
  | 00:03 | <ryankemper> | Restarted wdqs updater across all hosts: `sudo -E cumin -b 4 'A:wdqs-all' 'systemctl restart wdqs-updater'` | [production] | 
            
  | 00:02 | <ryankemper> | Following wdqs deploy, https://query.wikidata.org successfully responds to an example query | [production] | 
            
  | 00:01 | <ryankemper@deploy1001> | Finished deploy [wdqs/wdqs@8c97b17]: 0.3.53 (duration: 09m 29s) | [production] | 
            
  
    | 2020-10-28 |
  | 23:54 | <ryankemper> | Canary `wdqs1003` tests pass, proceeding with wdqs deploy to rest of fleet | [production] | 
            
  | 23:52 | <ryankemper@deploy1001> | Started deploy [wdqs/wdqs@8c97b17]: 0.3.53 | [production] | 
            
  | 23:52 | <ryankemper@deploy1001> | deploy aborted:  0.3.53 (duration: 00m 00s) | [production] | 
            
  | 23:52 | <ryankemper@deploy1001> | Started deploy [wdqs/wdqs@8c97b17]:  0.3.53 | [production] | 
            
  | 22:54 | <mutante> | scandium - scap pull after reinstalling OS | [production] | 
            
  | 22:14 | <dzahn@cumin1001> | END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) | [production] | 
            
  | 22:12 | <dzahn@cumin1001> | START - Cookbook sre.hosts.downtime | [production] | 
            
  | 21:41 | <ryankemper> | Disabled elasticsearch "saneitizer" systemd timer in eqiad due to checker jobs falling behind: `sudo systemctl disable mediawiki_job_cirrus_sanitize_jobs.timer` on `mwmaint1002` | [production] | 
            
  | 21:22 | <herron@cumin1001> | END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) | [production] | 
            
  | 21:05 | <hnowlan@cumin1001> | END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) | [production] | 
            
  | 21:05 | <hnowlan@cumin1001> | START - Cookbook sre.hosts.downtime | [production] | 
            
  | 20:50 | <herron@cumin1001> | START - Cookbook sre.ganeti.makevm | [production] | 
            
  | 20:22 | <ladsgroup@deploy1001> | Synchronized static/images/project-logos: Changing logo of Wikidata for the birthday (duration: 00m 58s) | [production] | 
            
  | 19:56 | <jgleeson> | updated Smashpig from 2246685626 to 09f29c1da5 | [production] | 
            
  | 19:53 | <herron@cumin1001> | END (ERROR) - Cookbook sre.ganeti.makevm (exit_code=97) | [production] | 
            
  | 19:53 | <herron@cumin1001> | START - Cookbook sre.ganeti.makevm | [production] | 
            
  | 19:50 | <herron@cumin1001> | END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) | [production] | 
            
  | 19:36 | <herron@cumin1001> | START - Cookbook sre.ganeti.makevm | [production] | 
            
  | 19:36 | <herron@cumin1001> | END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) | [production] | 
            
  | 19:36 | <herron@cumin1001> | START - Cookbook sre.ganeti.makevm | [production] |