2022-04-18
19:08 <mutante> - gitlab-prod-1001 is indeed back after soft rebooting the instance. uptime 1 min T297411 [devtools]
19:07 <mutante> - gitlab-prod-1001 randomly stopped working. we got the "puppet failed" mails without having made changes and can't ssh to the instance anymore when trying to check out why. trying soft reboot via Horizon T297411 [devtools]
19:06 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Depooling db1098:3316 (T298565)', diff saved to https://phabricator.wikimedia.org/P25157 and previous config saved to /var/cache/conftool/dbconfig/20220418-190640-ladsgroup.json [production]
19:06 <ladsgroup@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance [production]
19:06 <ladsgroup@cumin1001> START - Cookbook sre.hosts.downtime for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance [production]
19:06 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1180 (T298565)', diff saved to https://phabricator.wikimedia.org/P25156 and previous config saved to /var/cache/conftool/dbconfig/20220418-190632-ladsgroup.json [production]
19:04 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Depooling db1179 (T298565)', diff saved to https://phabricator.wikimedia.org/P25155 and previous config saved to /var/cache/conftool/dbconfig/20220418-190452-ladsgroup.json [production]
19:04 <ladsgroup@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1179.eqiad.wmnet with reason: Maintenance [production]
19:04 <ladsgroup@cumin1001> START - Cookbook sre.hosts.downtime for 6:00:00 on db1179.eqiad.wmnet with reason: Maintenance [production]
19:04 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1157 (T298565)', diff saved to https://phabricator.wikimedia.org/P25154 and previous config saved to /var/cache/conftool/dbconfig/20220418-190444-ladsgroup.json [production]
18:51 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P25153 and previous config saved to /var/cache/conftool/dbconfig/20220418-185126-ladsgroup.json [production]
18:49 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P25152 and previous config saved to /var/cache/conftool/dbconfig/20220418-184939-ladsgroup.json [production]
18:43 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Depooling db1147 (T298565)', diff saved to https://phabricator.wikimedia.org/P25151 and previous config saved to /var/cache/conftool/dbconfig/20220418-184325-ladsgroup.json [production]
18:43 <ladsgroup@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1147.eqiad.wmnet with reason: Maintenance [production]
18:43 <ladsgroup@cumin1001> START - Cookbook sre.hosts.downtime for 6:00:00 on db1147.eqiad.wmnet with reason: Maintenance [production]
18:43 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1144:3314 (T298565)', diff saved to https://phabricator.wikimedia.org/P25150 and previous config saved to /var/cache/conftool/dbconfig/20220418-184317-ladsgroup.json [production]
18:36 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P25149 and previous config saved to /var/cache/conftool/dbconfig/20220418-183621-ladsgroup.json [production]
18:34 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P25148 and previous config saved to /var/cache/conftool/dbconfig/20220418-183434-ladsgroup.json [production]
18:34 <Rook> update phab links to prefilled ticket https://phabricator.wikimedia.org/T303028 [quarry]
18:28 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1144:3314', diff saved to https://phabricator.wikimedia.org/P25147 and previous config saved to /var/cache/conftool/dbconfig/20220418-182812-ladsgroup.json [production]
18:21 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1180 (T298565)', diff saved to https://phabricator.wikimedia.org/P25146 and previous config saved to /var/cache/conftool/dbconfig/20220418-182116-ladsgroup.json [production]
18:19 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1157 (T298565)', diff saved to https://phabricator.wikimedia.org/P25145 and previous config saved to /var/cache/conftool/dbconfig/20220418-181929-ladsgroup.json [production]
18:13 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1144:3314', diff saved to https://phabricator.wikimedia.org/P25144 and previous config saved to /var/cache/conftool/dbconfig/20220418-181307-ladsgroup.json [production]
17:58 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1144:3314 (T298565)', diff saved to https://phabricator.wikimedia.org/P25143 and previous config saved to /var/cache/conftool/dbconfig/20220418-175802-ladsgroup.json [production]
17:47 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Depooling db1144:3314 (T298565)', diff saved to https://phabricator.wikimedia.org/P25142 and previous config saved to /var/cache/conftool/dbconfig/20220418-174704-ladsgroup.json [production]
17:47 <ladsgroup@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1144.eqiad.wmnet with reason: Maintenance [production]
17:47 <ladsgroup@cumin1001> START - Cookbook sre.hosts.downtime for 6:00:00 on db1144.eqiad.wmnet with reason: Maintenance [production]
17:46 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1143 (T298565)', diff saved to https://phabricator.wikimedia.org/P25141 and previous config saved to /var/cache/conftool/dbconfig/20220418-174656-ladsgroup.json [production]
17:31 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1143', diff saved to https://phabricator.wikimedia.org/P25140 and previous config saved to /var/cache/conftool/dbconfig/20220418-173151-ladsgroup.json [production]
17:30 <Rook> exposing query history https://phabricator.wikimedia.org/T100982 [quarry]
17:29 <Rook> updating links to phab with prefilled ticket links aef7c671a69a66a9872a48a24169f1f6bf2ffc4f [paws]
17:21 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Depooling db1180 (T298565)', diff saved to https://phabricator.wikimedia.org/P25139 and previous config saved to /var/cache/conftool/dbconfig/20220418-172101-ladsgroup.json [production]
17:20 <ladsgroup@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1180.eqiad.wmnet with reason: Maintenance [production]
17:20 <ladsgroup@cumin1001> START - Cookbook sre.hosts.downtime for 6:00:00 on db1180.eqiad.wmnet with reason: Maintenance [production]
17:19 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Depooling db1157 (T298565)', diff saved to https://phabricator.wikimedia.org/P25138 and previous config saved to /var/cache/conftool/dbconfig/20220418-171914-ladsgroup.json [production]
17:19 <ladsgroup@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1157.eqiad.wmnet with reason: Maintenance [production]
17:19 <ladsgroup@cumin1001> START - Cookbook sre.hosts.downtime for 6:00:00 on db1157.eqiad.wmnet with reason: Maintenance [production]
17:19 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1166 (T298565)', diff saved to https://phabricator.wikimedia.org/P25137 and previous config saved to /var/cache/conftool/dbconfig/20220418-171906-ladsgroup.json [production]
17:16 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1143', diff saved to https://phabricator.wikimedia.org/P25136 and previous config saved to /var/cache/conftool/dbconfig/20220418-171646-ladsgroup.json [production]
17:11 <ladsgroup@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance [production]
17:11 <ladsgroup@cumin1001> START - Cookbook sre.hosts.downtime for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance [production]
17:04 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P25135 and previous config saved to /var/cache/conftool/dbconfig/20220418-170401-ladsgroup.json [production]
17:01 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1143 (T298565)', diff saved to https://phabricator.wikimedia.org/P25134 and previous config saved to /var/cache/conftool/dbconfig/20220418-170141-ladsgroup.json [production]
17:01 <ladsgroup@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 8 hosts with reason: Maintenance [production]
17:01 <ladsgroup@cumin1001> START - Cookbook sre.hosts.downtime for 12:00:00 on 8 hosts with reason: Maintenance [production]
17:01 <ladsgroup@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2129.codfw.wmnet with reason: Maintenance [production]
17:01 <ladsgroup@cumin1001> START - Cookbook sre.hosts.downtime for 6:00:00 on db2129.codfw.wmnet with reason: Maintenance [production]
16:51 <ladsgroup@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance [production]
16:51 <ladsgroup@cumin1001> START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance [production]
16:51 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1131 (T298565)', diff saved to https://phabricator.wikimedia.org/P25133 and previous config saved to /var/cache/conftool/dbconfig/20220418-165139-ladsgroup.json [production]