2022-06-15
07:50 <marostegui@cumin1001> START - Cookbook sre.hosts.downtime for 12:00:00 on db1099.eqiad.wmnet with reason: Maintenance [production]
07:47 <marostegui@cumin1001> dbctl commit (dc=all): 'db1098:3317 (re)pooling @ 50%: After schema change', diff saved to https://phabricator.wikimedia.org/P29755 and previous config saved to /var/cache/conftool/dbconfig/20220615-074736-root.json [production]
07:35 <marostegui@cumin1001> dbctl commit (dc=all): 'db1148 (re)pooling @ 10%: After schema change', diff saved to https://phabricator.wikimedia.org/P29754 and previous config saved to /var/cache/conftool/dbconfig/20220615-073538-root.json [production]
07:32 <marostegui@cumin1001> dbctl commit (dc=all): 'db1098:3317 (re)pooling @ 25%: After schema change', diff saved to https://phabricator.wikimedia.org/P29753 and previous config saved to /var/cache/conftool/dbconfig/20220615-073232-root.json [production]
07:24 <marostegui@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance [production]
07:23 <marostegui@cumin1001> START - Cookbook sre.hosts.downtime for 12:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance [production]
07:23 <marostegui@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1105:3311 (T310011)', diff saved to https://phabricator.wikimedia.org/P29752 and previous config saved to /var/cache/conftool/dbconfig/20220615-072352-marostegui.json [production]
07:20 <marostegui@cumin1001> dbctl commit (dc=all): 'db1148 (re)pooling @ 5%: After schema change', diff saved to https://phabricator.wikimedia.org/P29751 and previous config saved to /var/cache/conftool/dbconfig/20220615-072034-root.json [production]
07:17 <marostegui@cumin1001> dbctl commit (dc=all): 'db1098:3317 (re)pooling @ 10%: After schema change', diff saved to https://phabricator.wikimedia.org/P29750 and previous config saved to /var/cache/conftool/dbconfig/20220615-071728-root.json [production]
07:08 <marostegui@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1105:3311', diff saved to https://phabricator.wikimedia.org/P29749 and previous config saved to /var/cache/conftool/dbconfig/20220615-070847-marostegui.json [production]
06:53 <marostegui@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1105:3311', diff saved to https://phabricator.wikimedia.org/P29748 and previous config saved to /var/cache/conftool/dbconfig/20220615-065342-marostegui.json [production]
06:52 <XioNoX> disable BGP to Telia in eqsin for optic replacement - T300485 [production]
06:38 <marostegui@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1105:3311 (T310011)', diff saved to https://phabricator.wikimedia.org/P29747 and previous config saved to /var/cache/conftool/dbconfig/20220615-063837-marostegui.json [production]
06:02 <marostegui> Reboot db[2071-2078] T310485 [production]
06:01 <marostegui@cumin1001> dbctl commit (dc=all): 'Depooling db1105:3311 (T310011)', diff saved to https://phabricator.wikimedia.org/P29746 and previous config saved to /var/cache/conftool/dbconfig/20220615-060153-marostegui.json [production]
06:01 <marostegui@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1105.eqiad.wmnet with reason: Maintenance [production]
06:01 <marostegui@cumin1001> START - Cookbook sre.hosts.downtime for 12:00:00 on db1105.eqiad.wmnet with reason: Maintenance [production]
05:42 <marostegui@cumin1001> dbctl commit (dc=all): 'Depooling db1098:3317 (T302659)', diff saved to https://phabricator.wikimedia.org/P29745 and previous config saved to /var/cache/conftool/dbconfig/20220615-054252-marostegui.json [production]
05:42 <marostegui@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1098.eqiad.wmnet with reason: Maintenance [production]
05:42 <marostegui@cumin1001> START - Cookbook sre.hosts.downtime for 12:00:00 on db1098.eqiad.wmnet with reason: Maintenance [production]
05:34 <marostegui@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1139.eqiad.wmnet with reason: Maintenance [production]
05:34 <marostegui@cumin1001> START - Cookbook sre.hosts.downtime for 12:00:00 on db1139.eqiad.wmnet with reason: Maintenance [production]
05:23 <marostegui@cumin1001> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1173.eqiad.wmnet with OS bullseye [production]
05:17 <marostegui> dbmaint es5@codfw T310485 [production]
05:17 <marostegui> dbmaint es4@codfw T310485 [production]
05:17 <marostegui> dbmaint es3@codfw T310485 [production]
05:17 <marostegui> dbmaint es2@codfw T310485 [production]
05:17 <marostegui> dbmaint es1@codfw T310485 [production]
05:07 <marostegui@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1173.eqiad.wmnet with reason: host reimage [production]
05:04 <marostegui@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on db1173.eqiad.wmnet with reason: host reimage [production]
05:03 <marostegui> Reboot dbproxy1016 and dbproxy1021 T310484 [production]
04:53 <marostegui@cumin1001> START - Cookbook sre.hosts.reimage for host db1173.eqiad.wmnet with OS bullseye [production]
02:31 <mwdebug-deploy@deploy1002> helmfile [codfw] DONE helmfile.d/services/mwdebug: apply [production]
02:30 <mwdebug-deploy@deploy1002> helmfile [codfw] START helmfile.d/services/mwdebug: apply [production]
02:30 <mwdebug-deploy@deploy1002> helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply [production]
02:29 <mwdebug-deploy@deploy1002> helmfile [eqiad] START helmfile.d/services/mwdebug: apply [production]
02:25 <tstarling@deploy1002> Synchronized php-1.39.0-wmf.16/includes/cache/MessageCache.php: (no justification provided) (duration: 03m 36s) [production]
02:24 <mwdebug-deploy@deploy1002> helmfile [codfw] DONE helmfile.d/services/mwdebug: apply [production]
02:21 <mwdebug-deploy@deploy1002> helmfile [codfw] START helmfile.d/services/mwdebug: apply [production]
02:21 <mwdebug-deploy@deploy1002> helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply [production]
02:17 <tstarling@deploy1002> Synchronized php-1.39.0-wmf.15/includes/cache/MessageCache.php: T310532 (duration: 03m 29s) [production]
02:17 <mwdebug-deploy@deploy1002> helmfile [eqiad] START helmfile.d/services/mwdebug: apply [production]
2022-06-14
23:52 <mutante> gitlab-runner1001/1002 - clean revert not possible, icinga alerting about failed buildkitd service, manually deleting systemd unit and trying to clean up T308271 [production]
23:49 <mutante> gitlab-runner1002 - systemctl restart docker; run-puppet-agent ; systemctl start buildkitd - fails though T308271 [production]
23:39 <mutante> gitlab-runner1001 - systemctl start buildkitd [production]
23:32 <mutante> gitlab-runner1001 - restarting docker [production]
23:08 <mutante> disabling puppet in gitlab-runners (via cumin /disable-puppet) before deploying gerrit:791655 to provide gitlab-runners with buildkit and new docker network - T308271 [production]
22:19 <mwdebug-deploy@deploy1002> helmfile [codfw] DONE helmfile.d/services/mwdebug: apply [production]
22:18 <mwdebug-deploy@deploy1002> helmfile [codfw] START helmfile.d/services/mwdebug: apply [production]
22:18 <mwdebug-deploy@deploy1002> helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply [production]