| 2021-11-19
      
      | 
    
  | 23:24 | <pt1979@cumin2002> | END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host prometheus2005.codfw.wmnet with OS bullseye | [production] | 
            
  | 23:15 | <mutante> | LDAP - added mmartorana to wmf (91354e9e-5706-4289-9a60-98e8a7632853) T295789 | [production] | 
            
  | 22:59 | <pt1979@cumin2002> | START - Cookbook sre.hosts.reimage for host prometheus2005.codfw.wmnet with OS bullseye | [production] | 
            
  | 20:24 | <pt1979@cumin2002> | END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kubernetes2018.codfw.wmnet with OS stretch | [production] | 
            
  | 20:21 | <mutante> | phabricator - adding eigyan to WMF-NDA (phab project 61 - https://phabricator.wikimedia.org/project/members/61/ ) - since that is now standard when adding people to the wmf LDAP group (T295928) | [production] | 
            
  | 20:20 | <legoktm@cumin1001> | END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts thumbor2002.codfw.wmnet | [production] | 
            
  | 20:05 | <legoktm@cumin1001> | START - Cookbook sre.hosts.decommission for hosts thumbor2002.codfw.wmnet | [production] | 
            
  | 20:00 | <dzahn@cumin1001> | END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts mw2280.codfw.wmnet | [production] | 
            
  | 19:55 | <pt1979@cumin2002> | START - Cookbook sre.hosts.reimage for host kubernetes2018.codfw.wmnet with OS stretch | [production] | 
            
  | 19:51 | <mutante> | shutting down undead server mw2280 - not in icinga and puppetdb but in debmonitor and still has IP and puppet cert | [production] | 
            
  | 19:45 | <dzahn@cumin1001> | START - Cookbook sre.hosts.decommission for hosts mw2280.codfw.wmnet | [production] | 
            
  | 18:54 | <hnowlan@cumin1001> | END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:restbase-codfw: Restarting to pick up Java security updates - hnowlan@cumin1001 | [production] | 
            
  | 18:10 | <andrew@deploy1002> | Finished deploy [horizon/deploy@ba16257]: moving the proxy endpoint behind keystone (duration: 04m 19s) | [production] | 
            
  | 18:06 | <andrew@deploy1002> | Started deploy [horizon/deploy@ba16257]: moving the proxy endpoint behind keystone | [production] | 
            
  | 17:45 | <pt1979@cumin2002> | END (PASS) - Cookbook sre.dns.netbox (exit_code=0) | [production] | 
            
  | 17:41 | <pt1979@cumin2002> | START - Cookbook sre.dns.netbox | [production] | 
            
  | 17:25 | <andrew@deploy1002> | Finished deploy [horizon/deploy@ee83e27]: fixing sudo rule editing (duration: 04m 10s) | [production] | 
            
  | 17:21 | <andrew@deploy1002> | Started deploy [horizon/deploy@ee83e27]: fixing sudo rule editing | [production] | 
            
  | 17:19 | <mwdebug-deploy@deploy1002> | helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . | [production] | 
            
  | 17:10 | <mwdebug-deploy@deploy1002> | helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . | [production] | 
            
  | 16:54 | <mwdebug-deploy@deploy1002> | helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . | [production] | 
            
  | 16:50 | <mwdebug-deploy@deploy1002> | helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . | [production] | 
            
  | 16:42 | <thcipriani@deploy1002> | rebuilt and synchronized wikiversions files: Revert "group1 wikis to 1.38.0-wmf.9  refs T293950 T296098" | [production] | 
            
  | 16:35 | <thcipriani> | rolling back to group0 for T296098 | [production] | 
            
  | 16:20 | <hnowlan@cumin1001> | START - Cookbook sre.cassandra.roll-restart for nodes matching A:restbase-codfw: Restarting to pick up Java security updates - hnowlan@cumin1001 | [production] | 
            
  | 15:31 | <akosiaris> | roll restart wtp10* php7.2-fpm excluding wtp1025, wtp1041 | [production] | 
            
  | 15:29 | <akosiaris> | depooling wtp1041, wtp1025 from traffic. The entire parsoid cluster is under memory pressure; it looks like a rolling restart of php-fpm will alleviate the pressure and give us some time to drill into the problem before the pressure builds up again. | [production] | 
            
  | 15:28 | <akosiaris@cumin1001> | conftool action : set/pooled=no; selector: cluster=parsoid,name=wtp1025.eqiad.wmnet | [production] | 
            
  | 15:28 | <akosiaris@cumin1001> | conftool action : set/pooled=no; selector: cluster=parsoid,name=wtp1041.eqiad.wmnet | [production] | 
            
  | 14:52 | <jmm@cumin2002> | END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti-test2001.codfw.wmnet to ganeti-test01.svc.codfw.wmnet | [production] | 
            
  | 14:49 | <jmm@cumin2002> | START - Cookbook sre.ganeti.addnode for new host ganeti-test2001.codfw.wmnet to ganeti-test01.svc.codfw.wmnet | [production] | 
            
  | 14:44 | <jmm@cumin2002> | END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti-test2001.codfw.wmnet | [production] | 
            
  | 14:39 | <jmm@cumin2002> | START - Cookbook sre.hosts.reboot-single for host ganeti-test2001.codfw.wmnet | [production] | 
            
  | 14:30 | <jmm@cumin2002> | END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti-test2001.codfw.wmnet with OS buster | [production] | 
            
  | 14:15 | <jayme> | fleet wide updated wmf-certificates to 0~20211119-1 | [production] | 
            
  | 13:56 | <jmm@cumin2002> | START - Cookbook sre.hosts.reimage for host ganeti-test2001.codfw.wmnet with OS buster | [production] | 
            
  | 13:23 | <moritzm> | draining instances from ganeti-test2001 for reimage T284811 | [production] | 
            
  | 13:02 | <jgiannelos@deploy1002> | helmfile [eqiad] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' . | [production] | 
            
  | 12:10 | <jgiannelos@deploy1002> | helmfile [codfw] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' . | [production] | 
            
  | 12:06 | <jgiannelos@deploy1002> | helmfile [staging] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' . | [production] | 
            
  | 11:54 | <hnowlan> | roll-restarting cassandra on eqiad maps for java updates | [production] | 
            
  | 11:36 | <jayme> | imported wmf-certificates 0~20211119-1 to stretch-wikimedia,buster-wikimedia,bullseye-wikimedia | [production] | 
            
  | 09:53 | <XioNoX> | run `commit full` on asw-b-codfw - T295118 | [production] | 
            
  | 09:30 | <XioNoX> | re-enable cr2-codfw<->asw-b7-codfw link after disabling inet6 on cr2-codfw:ae2 - T295118 | [production] | 
            
  | 09:06 | <elukey@cumin1001> | END (PASS) - Cookbook sre.kafka.roll-restart-brokers (exit_code=0) for Kafka A:kafka-test-eqiad cluster: Roll restart of jvm daemons for openjdk upgrade. | [production] | 
            
  | 08:46 | <elukey@cumin1001> | START - Cookbook sre.kafka.roll-restart-brokers for Kafka A:kafka-test-eqiad cluster: Roll restart of jvm daemons for openjdk upgrade. | [production] | 
            
  | 08:31 | <mwdebug-deploy@deploy1002> | helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . | [production] | 
            
  | 08:30 | <ayounsi@cumin1001> | END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) homer to cumin2002.codfw.wmnet with reason: update wmf-netbox - ayounsi@cumin1001 | [production] | 
            
  | 08:29 | <ayounsi@cumin1001> | START - Cookbook sre.deploy.python-code homer to cumin2002.codfw.wmnet with reason: update wmf-netbox - ayounsi@cumin1001 | [production] | 
            
  | 08:27 | <mwdebug-deploy@deploy1002> | helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . | [production] |