2023-02-08 §
12:03 <eoghan@cumin1001> START - Cookbook sre.hosts.reimage for host gitlab-runner1002.eqiad.wmnet with OS bullseye [production]
11:59 <btullis@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on dse-k8s-worker1001.eqiad.wmnet with reason: Attempting to move some GPUs [production]
11:59 <btullis@cumin1001> START - Cookbook sre.hosts.downtime for 8:00:00 on dse-k8s-worker1001.eqiad.wmnet with reason: Attempting to move some GPUs [production]
11:58 <btullis@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on an-worker1097.eqiad.wmnet with reason: Attempting to move some GPUs [production]
11:57 <btullis@cumin1001> START - Cookbook sre.hosts.downtime for 8:00:00 on an-worker1097.eqiad.wmnet with reason: Attempting to move some GPUs [production]
11:57 <btullis@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on an-worker1096.eqiad.wmnet with reason: Attempting to move some GPUs [production]
11:57 <btullis@cumin1001> START - Cookbook sre.hosts.downtime for 8:00:00 on an-worker1096.eqiad.wmnet with reason: Attempting to move some GPUs [production]
11:56 <jmm@cumin2002> END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: moss-be1001.eqiad.wmnet [production]
11:56 <jmm@cumin2002> START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: moss-be1001.eqiad.wmnet [production]
11:55 <marostegui@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P43799 and previous config saved to /var/cache/conftool/dbconfig/20230208-115546-marostegui.json [production]
11:53 <jmm@cumin2002> END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: flowspec1001.eqiad.wmnet [production]
11:53 <jmm@cumin2002> START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: flowspec1001.eqiad.wmnet [production]
11:40 <marostegui@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1165 (T328817)', diff saved to https://phabricator.wikimedia.org/P43798 and previous config saved to /var/cache/conftool/dbconfig/20230208-114040-marostegui.json [production]
11:38 <marostegui@cumin1001> dbctl commit (dc=all): 'Depooling db1165 (T328817)', diff saved to https://phabricator.wikimedia.org/P43797 and previous config saved to /var/cache/conftool/dbconfig/20230208-113832-marostegui.json [production]
11:38 <marostegui@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance [production]
11:38 <marostegui@cumin1001> START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance [production]
11:38 <marostegui@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1165.eqiad.wmnet with reason: Maintenance [production]
11:37 <marostegui@cumin1001> START - Cookbook sre.hosts.downtime for 12:00:00 on db1165.eqiad.wmnet with reason: Maintenance [production]
11:13 <marostegui> Stop mysql on db1096 (s5,s6) T329147 [production]
11:05 <marostegui@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1140.eqiad.wmnet with reason: Maintenance [production]
11:05 <marostegui@cumin1001> START - Cookbook sre.hosts.downtime for 12:00:00 on db1140.eqiad.wmnet with reason: Maintenance [production]
11:05 <marostegui@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1113:3316 (T328817)', diff saved to https://phabricator.wikimedia.org/P43796 and previous config saved to /var/cache/conftool/dbconfig/20230208-110507-marostegui.json [production]
10:57 <zabe@deploy1002> Finished scap: Backport for [[gerrit:887748|Remove cul_reason comment table migration code (T233004 T329151)]] (duration: 08m 05s) [production]
10:51 <zabe@deploy1002> zabe: Backport for [[gerrit:887748|Remove cul_reason comment table migration code (T233004 T329151)]] synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet [production]
10:50 <marostegui@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1113:3316', diff saved to https://phabricator.wikimedia.org/P43793 and previous config saved to /var/cache/conftool/dbconfig/20230208-105001-marostegui.json [production]
10:49 <zabe@deploy1002> Started scap: Backport for [[gerrit:887748|Remove cul_reason comment table migration code (T233004 T329151)]] [production]
10:38 <jmm@cumin2002> END (PASS) - Cookbook sre.swift.roll-restart-reboot-swift-thanos-proxies (exit_code=0) rolling restart_daemons on A:thanos-fe [production]
10:35 <jmm@cumin2002> START - Cookbook sre.swift.roll-restart-reboot-swift-thanos-proxies rolling restart_daemons on A:thanos-fe [production]
10:34 <marostegui@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1113:3316', diff saved to https://phabricator.wikimedia.org/P43791 and previous config saved to /var/cache/conftool/dbconfig/20230208-103455-marostegui.json [production]
10:33 <volans> deploying python3-wmflib_1.2.1 to the fleet [production]
10:28 <zabe@deploy1002> Finished scap: Backport for [[gerrit:887747|Revert "slwiki: Raise AF emergency disable treshold+count" (T328366)]] (duration: 08m 49s) [production]
10:26 <marostegui> Failover m2-master from dbproxy1013 to dbproxy1015 T329073 [production]
10:21 <zabe@deploy1002> zabe: Backport for [[gerrit:887747|Revert "slwiki: Raise AF emergency disable treshold+count" (T328366)]] synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet [production]
10:19 <zabe@deploy1002> Started scap: Backport for [[gerrit:887747|Revert "slwiki: Raise AF emergency disable treshold+count" (T328366)]] [production]
10:19 <marostegui@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1113:3316 (T328817)', diff saved to https://phabricator.wikimedia.org/P43790 and previous config saved to /var/cache/conftool/dbconfig/20230208-101948-marostegui.json [production]
10:15 <marostegui@cumin1001> dbctl commit (dc=all): 'Depooling db1113:3316 (T328817)', diff saved to https://phabricator.wikimedia.org/P43789 and previous config saved to /var/cache/conftool/dbconfig/20230208-101534-marostegui.json [production]
10:15 <marostegui@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1113.eqiad.wmnet with reason: Maintenance [production]
10:15 <marostegui@cumin1001> START - Cookbook sre.hosts.downtime for 12:00:00 on db1113.eqiad.wmnet with reason: Maintenance [production]
10:15 <marostegui@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1098:3316 (T328817)', diff saved to https://phabricator.wikimedia.org/P43788 and previous config saved to /var/cache/conftool/dbconfig/20230208-101512-marostegui.json [production]
10:08 <phedenskog@deploy1002> Finished deploy [performance/navtiming@079891a]: (no justification provided) (duration: 00m 08s) [production]
10:08 <phedenskog@deploy1002> Started deploy [performance/navtiming@079891a]: (no justification provided) [production]
10:07 <jelto@cumin1001> END (FAIL) - Cookbook sre.gitlab.upgrade (exit_code=99) on GitLab host gitlab1003.wikimedia.org with reason: Test Upgrade GitLab Replica gitlab1003 with invalid version [production]
10:07 <jelto@cumin1001> START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1003.wikimedia.org with reason: Test Upgrade GitLab Replica gitlab1003 with invalid version [production]
10:00 <marostegui@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1098:3316', diff saved to https://phabricator.wikimedia.org/P43787 and previous config saved to /var/cache/conftool/dbconfig/20230208-100006-marostegui.json [production]
09:59 <moritzm> installing openssl security updates on bullseye [production]
09:52 <marostegui@cumin1001> dbctl commit (dc=all): 'Remove db1096 (s5,s6) from dbctl T329147', diff saved to https://phabricator.wikimedia.org/P43786 and previous config saved to /var/cache/conftool/dbconfig/20230208-095207-marostegui.json [production]
09:45 <marostegui@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1098:3316', diff saved to https://phabricator.wikimedia.org/P43785 and previous config saved to /var/cache/conftool/dbconfig/20230208-094500-marostegui.json [production]
09:29 <marostegui@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1098:3316 (T328817)', diff saved to https://phabricator.wikimedia.org/P43783 and previous config saved to /var/cache/conftool/dbconfig/20230208-092954-marostegui.json [production]
09:14 <godog> purge user_auth table on grafana1002 - T328784 [production]
08:54 <moritzm> installing imagemagick security updates [production]