2022-04-10
|
19:39 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Depooling db1106 (T298565)', diff saved to https://phabricator.wikimedia.org/P24339 and previous config saved to /var/cache/conftool/dbconfig/20220410-193900-ladsgroup.json |
[production] |
19:38 |
<ladsgroup@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance |
[production] |
19:38 |
<ladsgroup@cumin1001> |
START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance |
[production] |
19:38 |
<ladsgroup@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1106.eqiad.wmnet with reason: Maintenance |
[production] |
19:38 |
<ladsgroup@cumin1001> |
START - Cookbook sre.hosts.downtime for 6:00:00 on db1106.eqiad.wmnet with reason: Maintenance |
[production] |
18:43 |
<taavi> |
deleted `/tmp/dwl02.out-20210915` on tools-sgebastion-07 (not touched since September, taking up 1.3G of disk space) |
[tools] |
2022-04-09
|
19:55 |
<andrewbogott> |
reimaging cloudbackup1001-dev to bullseye |
[admin] |
19:37 |
<taavi> |
add 'puppet-enc' service & endpoint to keystone T274666 |
[admin] |
19:25 |
<andrewbogott> |
reimaging cloudbackup1002-dev to bullseye |
[admin] |
15:30 |
<taavi> |
manually prune user.log on tools-prometheus-03 to free up some space on / |
[tools] |
12:39 |
<godog> |
bounce prometheus@ops on prometheus5001 |
[production] |
12:27 |
<ariel@cumin1001> |
END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dumpsdata1002.eqiad.wmnet |
[production] |
12:22 |
<ariel@cumin1001> |
START - Cookbook sre.hosts.reboot-single for host dumpsdata1002.eqiad.wmnet |
[production] |
12:22 |
<ariel@cumin1001> |
END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host dumpsdata1002.eqiad.wmnet |
[production] |
12:22 |
<ariel@cumin1001> |
START - Cookbook sre.hosts.reboot-single for host dumpsdata1002.eqiad.wmnet |
[production] |
12:20 |
<ariel@cumin1001> |
END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host dumpsdata1002.eqiad.wmnet |
[production] |
12:20 |
<ariel@cumin1001> |
START - Cookbook sre.hosts.reboot-single for host dumpsdata1002.eqiad.wmnet |
[production] |
03:55 |
<legoktm> |
restarting everything to add new "wmcs" instance |
[codesearch] |
03:08 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1106 (T298565)', diff saved to https://phabricator.wikimedia.org/P24337 and previous config saved to /var/cache/conftool/dbconfig/20220409-030854-ladsgroup.json |
[production] |
02:53 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1106', diff saved to https://phabricator.wikimedia.org/P24336 and previous config saved to /var/cache/conftool/dbconfig/20220409-025349-ladsgroup.json |
[production] |
02:38 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1106', diff saved to https://phabricator.wikimedia.org/P24335 and previous config saved to /var/cache/conftool/dbconfig/20220409-023843-ladsgroup.json |
[production] |
02:23 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1106 (T298565)', diff saved to https://phabricator.wikimedia.org/P24334 and previous config saved to /var/cache/conftool/dbconfig/20220409-022338-ladsgroup.json |
[production] |
00:53 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Depooling db1106 (T298565)', diff saved to https://phabricator.wikimedia.org/P24333 and previous config saved to /var/cache/conftool/dbconfig/20220409-005351-ladsgroup.json |
[production] |
00:53 |
<ladsgroup@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance |
[production] |
00:53 |
<ladsgroup@cumin1001> |
START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance |
[production] |
00:53 |
<ladsgroup@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1106.eqiad.wmnet with reason: Maintenance |
[production] |
00:53 |
<ladsgroup@cumin1001> |
START - Cookbook sre.hosts.downtime for 6:00:00 on db1106.eqiad.wmnet with reason: Maintenance |
[production] |
00:53 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1169 (T298565)', diff saved to https://phabricator.wikimedia.org/P24332 and previous config saved to /var/cache/conftool/dbconfig/20220409-005338-ladsgroup.json |
[production] |
00:38 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P24331 and previous config saved to /var/cache/conftool/dbconfig/20220409-003832-ladsgroup.json |
[production] |
00:23 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P24330 and previous config saved to /var/cache/conftool/dbconfig/20220409-002327-ladsgroup.json |
[production] |
00:08 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1169 (T298565)', diff saved to https://phabricator.wikimedia.org/P24329 and previous config saved to /var/cache/conftool/dbconfig/20220409-000822-ladsgroup.json |
[production] |
2022-04-08
|
22:53 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Depooling db1169 (T298565)', diff saved to https://phabricator.wikimedia.org/P24328 and previous config saved to /var/cache/conftool/dbconfig/20220408-225350-ladsgroup.json |
[production] |
22:53 |
<ladsgroup@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1169.eqiad.wmnet with reason: Maintenance |
[production] |
22:53 |
<ladsgroup@cumin1001> |
START - Cookbook sre.hosts.downtime for 6:00:00 on db1169.eqiad.wmnet with reason: Maintenance |
[production] |
22:53 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1164 (T298565)', diff saved to https://phabricator.wikimedia.org/P24327 and previous config saved to /var/cache/conftool/dbconfig/20220408-225342-ladsgroup.json |
[production] |
22:38 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1164', diff saved to https://phabricator.wikimedia.org/P24326 and previous config saved to /var/cache/conftool/dbconfig/20220408-223837-ladsgroup.json |
[production] |
22:23 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1164', diff saved to https://phabricator.wikimedia.org/P24325 and previous config saved to /var/cache/conftool/dbconfig/20220408-222332-ladsgroup.json |
[production] |
22:09 |
<mutante> |
gitlab - deleted runner-1008 (to replace it with a bullseye instance), recreated runner-1020 with same flavor as existing runners T297659 |
[production] |
22:08 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1164 (T298565)', diff saved to https://phabricator.wikimedia.org/P24324 and previous config saved to /var/cache/conftool/dbconfig/20220408-220827-ladsgroup.json |
[production] |
22:03 |
<mutante> |
- deleting instance runner-1020 and recreating it with the same name but flavor g3.cores8.ram24.disk20 T297659 |
[gitlab-runners] |
22:01 |
<mutante> |
- deleting instance runner-1008 in Horizon and also deleting it in gitlab admin UI about the same time T297659 |
[gitlab-runners] |
20:59 |
<mutante> |
- pausing runner-1008 from accepting new jobs, hoping it will finish all jobs already queued; once that is down to 0 I can replace it with a new runner on bullseye (T297659) |
[gitlab-runners] |
20:41 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Depooling db1164 (T298565)', diff saved to https://phabricator.wikimedia.org/P24323 and previous config saved to /var/cache/conftool/dbconfig/20220408-204138-ladsgroup.json |
[production] |
20:41 |
<ladsgroup@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1164.eqiad.wmnet with reason: Maintenance |
[production] |
20:41 |
<ladsgroup@cumin1001> |
START - Cookbook sre.hosts.downtime for 6:00:00 on db1164.eqiad.wmnet with reason: Maintenance |
[production] |
20:41 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1184 (T298565)', diff saved to https://phabricator.wikimedia.org/P24322 and previous config saved to /var/cache/conftool/dbconfig/20220408-204129-ladsgroup.json |
[production] |
20:26 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1184', diff saved to https://phabricator.wikimedia.org/P24321 and previous config saved to /var/cache/conftool/dbconfig/20220408-202624-ladsgroup.json |
[production] |
20:11 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1184', diff saved to https://phabricator.wikimedia.org/P24320 and previous config saved to /var/cache/conftool/dbconfig/20220408-201119-ladsgroup.json |
[production] |
19:56 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1184 (T298565)', diff saved to https://phabricator.wikimedia.org/P24319 and previous config saved to /var/cache/conftool/dbconfig/20220408-195614-ladsgroup.json |
[production] |
18:38 |
<mutante> |
gitlab1001 - giving myself gitlab admin rights via rake console, to be able to connect/disconnect runners T297659 |
[production] |