2022-04-01
|
10:06 |
<vgutierrez> |
vgutierrez@puppetmaster2001:~$ sudo -i rm /var/run/confd-template/.ml-staging-ctrl*.err |
[production] |
10:04 |
<vgutierrez> |
vgutierrez@puppetmaster1001:~$ sudo -i rm /var/run/confd-template/.ml-staging-ctrl*.err |
[production] |
10:03 |
<vgutierrez@cumin1001> |
END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ncredir5001.eqsin.wmnet |
[production] |
09:57 |
<vgutierrez@cumin1001> |
START - Cookbook sre.hosts.reboot-single for host ncredir5001.eqsin.wmnet |
[production] |
09:47 |
<vgutierrez@cumin1001> |
END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ncredir4002.ulsfo.wmnet |
[production] |
09:43 |
<vgutierrez@cumin1001> |
START - Cookbook sre.hosts.reboot-single for host ncredir4002.ulsfo.wmnet |
[production] |
09:43 |
<vgutierrez@cumin1001> |
END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ncredir4001.ulsfo.wmnet |
[production] |
09:37 |
<vgutierrez@cumin1001> |
START - Cookbook sre.hosts.reboot-single for host ncredir4001.ulsfo.wmnet |
[production] |
09:35 |
<vgutierrez@cumin1001> |
END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host ncredir3002.esams.wmnet |
[production] |
09:24 |
<vgutierrez@cumin1001> |
START - Cookbook sre.hosts.reboot-single for host ncredir3002.esams.wmnet |
[production] |
09:24 |
<vgutierrez@cumin1001> |
END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ncredir3001.esams.wmnet |
[production] |
09:18 |
<vgutierrez@cumin1001> |
START - Cookbook sre.hosts.reboot-single for host ncredir3001.esams.wmnet |
[production] |
09:16 |
<vgutierrez@cumin1001> |
END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ncredir2002.codfw.wmnet |
[production] |
09:10 |
<vgutierrez@cumin1001> |
START - Cookbook sre.hosts.reboot-single for host ncredir2002.codfw.wmnet |
[production] |
09:10 |
<vgutierrez@cumin1001> |
END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host ncredir2001.codfw.wmnet |
[production] |
08:59 |
<vgutierrez@cumin1001> |
START - Cookbook sre.hosts.reboot-single for host ncredir2001.codfw.wmnet |
[production] |
08:58 |
<vgutierrez@cumin1001> |
END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ncredir1002.eqiad.wmnet |
[production] |
08:54 |
<vgutierrez@cumin1001> |
START - Cookbook sre.hosts.reboot-single for host ncredir1002.eqiad.wmnet |
[production] |
08:53 |
<vgutierrez@cumin1001> |
END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ncredir1001.eqiad.wmnet |
[production] |
08:49 |
<vgutierrez@cumin1001> |
START - Cookbook sre.hosts.reboot-single for host ncredir1001.eqiad.wmnet |
[production] |
08:48 |
<vgutierrez@cumin1001> |
END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host ncredir1001.eqiad.wmnet |
[production] |
08:48 |
<vgutierrez@cumin1001> |
START - Cookbook sre.hosts.reboot-single for host ncredir1001.eqiad.wmnet |
[production] |
08:44 |
<vgutierrez@cumin1001> |
END (FAIL) - Cookbook sre.hosts.reboot-cluster (exit_code=99) |
[production] |
08:44 |
<vgutierrez@cumin1001> |
START - Cookbook sre.hosts.reboot-cluster |
[production] |
08:42 |
<vgutierrez> |
rolling restart of ncredir instances to catch up on kernel upgrades |
[production] |
06:54 |
<XioNoX> |
traffic engineering in drmrs to prevent link saturation |
[production] |
2022-03-31
|
23:45 |
<mutante> |
gitlab2001 - fdisk /dev/vdb (g, w) (create partition table), (n, w) (create partition); mkfs.ext4 /dev/vdb1 (create filesystem); systemctl reset-failed (fix Icinga alert); mkdir /mnt/gitlab-backup; mount /dev/vdb1 /mnt/gitlab-backup; blkid (get UUID); edit /etc/fstab and insert "UUID=c5235682-ac21-46a9-85ee-9603f694a6a4 /mnt/gitlab-backup ext4 errors=remount-ro 0 2" T274463 |
[production] |
23:27 |
<mutante> |
gitlab2001 - rebooted at the Ganeti level (needed when adding new virtual hardware), then hit the usual bug T272555, where the interface in /etc/network/interfaces has to be fixed manually T274463 |
[production] |
23:21 |
<mutante> |
gitlab2001 (gitlab-replica.wikimedia.org) - rebooting to add new virtual disk T274463 |
[production] |
23:11 |
<ejegg> |
updated payments-wiki from 47d9bd27 to 6f888c28 |
[production] |
23:01 |
<bblack> |
esams->drmrs failover test begins - T304089 |
[production] |
22:34 |
<moritzm> |
updated CAS to 6.4.6.2 |
[production] |
22:28 |
<mutante> |
ganeti - creating new 100G virtual disk on gitlab1001 T274463 |
[production] |
22:24 |
<mutante> |
ganeti - creating new 100G virtual disk on gitlab2001 T274463 |
[production] |
22:16 |
<bking@cumin1001> |
END (PASS) - Cookbook sre.wdqs.reboot (exit_code=0) |
[production] |
22:03 |
<bking@cumin1001> |
START - Cookbook sre.wdqs.reboot |
[production] |
22:02 |
<bking@cumin1001> |
END (PASS) - Cookbook sre.wdqs.reboot (exit_code=0) |
[production] |
21:51 |
<bking@cumin1001> |
START - Cookbook sre.wdqs.reboot |
[production] |
21:48 |
<bking@cumin1001> |
END (PASS) - Cookbook sre.wdqs.reboot (exit_code=0) |
[production] |
21:40 |
<bking@cumin1001> |
START - Cookbook sre.wdqs.reboot |
[production] |
21:20 |
<mwdebug-deploy@deploy1002> |
helmfile [codfw] DONE helmfile.d/services/mwdebug: apply |
[production] |
21:19 |
<mwdebug-deploy@deploy1002> |
helmfile [codfw] START helmfile.d/services/mwdebug: apply |
[production] |
21:19 |
<mwdebug-deploy@deploy1002> |
helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply |
[production] |
21:19 |
<bblack@cumin1001> |
conftool action : set/pooled=yes; selector: name=^(cp1075|cp1079|cp2035|cp3050|cp3051|cp3052|cp3054|cp4022|cp5013|cp5014|cp5015).* |
[production] |
21:18 |
<mwdebug-deploy@deploy1002> |
helmfile [eqiad] START helmfile.d/services/mwdebug: apply |
[production] |
21:17 |
<bblack@cumin1001> |
conftool action : select; selector: name="^(cp1075|cp1079|cp2035|cp3050|cp3051|cp3052|cp3054|cp4022|cp5013|cp5014|cp5015).*" |
[production] |
21:13 |
<catrope@deploy1002> |
Synchronized wmf-config/CommonSettings.php: [[gerrit:775876|Remove unused Flow config]] (duration: 00m 49s) |
[production] |
21:07 |
<bblack@cumin1001> |
conftool action : set/pooled=yes; selector: name=cp5012.eqsin.wmnet |
[production] |
21:07 |
<bking@cumin1001> |
END (PASS) - Cookbook sre.wdqs.reboot (exit_code=0) |
[production] |
21:06 |
<thcipriani> |
UTC late backport complete |
[production] |