2021-12-04
01:14 <mutante> mx2001 - did not come back from reboot, did not get IP on interface, could not start ferm, logged in via console with root password, in /etc/network/interfaces replaced all "ens5" with "ens13", rebooted again, selected previous kernel version [production]
00:54 <mutante> rebooting mx2001 [production]
00:31 <jynus> manually restarting clamav on otrs1001 after being killed [production]
2021-12-03
20:29 <cstone> revision changed from 2c2e22cd to b82183b9 [production]
17:56 <andrew@cumin1001> END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudvirt1028.eqiad.wmnet with OS buster [production]
17:47 <andrew@cumin1001> START - Cookbook sre.hosts.reimage for host cloudvirt1028.eqiad.wmnet with OS buster [production]
17:47 <andrew@cumin1001> END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudvirt1028.eqiad.wmnet with OS buster [production]
17:35 <andrew@cumin1001> START - Cookbook sre.hosts.reimage for host cloudvirt1028.eqiad.wmnet with OS buster [production]
17:35 <andrew@cumin1001> END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudvirt1028.eqiad.wmnet with OS buster [production]
17:35 <razzi@cumin1001> END (PASS) - Cookbook sre.aqs.roll-restart (exit_code=0) for AQS aqs cluster: Roll restart of all AQS's nodejs daemons. [production]
17:22 <razzi@cumin1001> START - Cookbook sre.aqs.roll-restart for AQS aqs cluster: Roll restart of all AQS's nodejs daemons. [production]
16:56 <andrew@cumin1001> START - Cookbook sre.hosts.reimage for host cloudvirt1028.eqiad.wmnet with OS buster [production]
16:56 <andrew@cumin1001> END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudvirt1028.eqiad.wmnet with OS buster [production]
16:44 <andrew@cumin1001> START - Cookbook sre.hosts.reimage for host cloudvirt1028.eqiad.wmnet with OS buster [production]
16:42 <andrew@cumin1001> END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudvirt1028.eqiad.wmnet with OS buster [production]
16:42 <andrew@cumin1001> START - Cookbook sre.hosts.reimage for host cloudvirt1028.eqiad.wmnet with OS buster [production]
16:39 <andrew@cumin1001> END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudvirt1028.eqiad.wmnet with OS buster [production]
16:39 <andrew@cumin1001> START - Cookbook sre.hosts.reimage for host cloudvirt1028.eqiad.wmnet with OS buster [production]
14:25 <jelto@cumin1001> END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host gitlab-runner2001.codfw.wmnet [production]
14:10 <jelto@cumin1001> START - Cookbook sre.ganeti.makevm for new host gitlab-runner2001.codfw.wmnet [production]
12:53 <moritzm> installing nss security updates on stretch [production]
12:37 <moritzm> draining primary/secondary instances off ganeti2007 T296622 [production]
12:33 <jmm@cumin2002> END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti2022.codfw.wmnet to ganeti01.svc.codfw.wmnet [production]
12:33 <jmm@cumin2002> START - Cookbook sre.ganeti.addnode for new host ganeti2022.codfw.wmnet to ganeti01.svc.codfw.wmnet [production]
12:30 <jmm@cumin2002> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2022.codfw.wmnet [production]
12:26 <jmm@cumin2002> START - Cookbook sre.hosts.reboot-single for host ganeti2022.codfw.wmnet [production]
12:13 <jmm@cumin2002> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti2022.codfw.wmnet with OS buster [production]
11:30 <jmm@cumin2002> START - Cookbook sre.hosts.reimage for host ganeti2022.codfw.wmnet with OS buster [production]
11:27 <jmm@cumin2002> END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ganeti2011.codfw.wmnet with OS buster [production]
11:08 <jmm@cumin2002> START - Cookbook sre.hosts.reimage for host ganeti2011.codfw.wmnet with OS buster [production]
11:06 <jynus> stop and shutdown db1102 T296546 [production]
11:01 <jmm@cumin2002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on ganeti2011.codfw.wmnet with reason: Temporarily remove node from Ganeti for reimage [production]
11:01 <jmm@cumin2002> START - Cookbook sre.hosts.downtime for 3 days, 0:00:00 on ganeti2011.codfw.wmnet with reason: Temporarily remove node from Ganeti for reimage [production]
09:38 <moritzm> draining primary/secondary instances off ganeti2011 T296622 [production]
09:25 <jmm@cumin2002> END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti2009.codfw.wmnet to ganeti01.svc.codfw.wmnet [production]
09:24 <jmm@cumin2002> START - Cookbook sre.ganeti.addnode for new host ganeti2009.codfw.wmnet to ganeti01.svc.codfw.wmnet [production]
09:23 <jmm@cumin2002> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2009.codfw.wmnet [production]
09:18 <jmm@cumin2002> START - Cookbook sre.hosts.reboot-single for host ganeti2009.codfw.wmnet [production]
09:15 <marostegui@cumin1001> dbctl commit (dc=all): 'After maintenance db1161 (T277354)', diff saved to https://phabricator.wikimedia.org/P18019 and previous config saved to /var/cache/conftool/dbconfig/20211203-091537-marostegui.json [production]
09:00 <marostegui@cumin1001> dbctl commit (dc=all): 'After maintenance db1161', diff saved to https://phabricator.wikimedia.org/P18018 and previous config saved to /var/cache/conftool/dbconfig/20211203-090033-marostegui.json [production]
08:58 <jmm@cumin2002> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti2009.codfw.wmnet with OS buster [production]
08:45 <marostegui@cumin1001> dbctl commit (dc=all): 'After maintenance db1161', diff saved to https://phabricator.wikimedia.org/P18017 and previous config saved to /var/cache/conftool/dbconfig/20211203-084528-marostegui.json [production]
08:44 <oblivian@deploy1002> helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . [production]
08:43 <oblivian@deploy1002> helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . [production]
08:30 <marostegui@cumin1001> dbctl commit (dc=all): 'After maintenance db1161 (T277354)', diff saved to https://phabricator.wikimedia.org/P18016 and previous config saved to /var/cache/conftool/dbconfig/20211203-083023-marostegui.json [production]
08:30 <jmm@cumin2002> START - Cookbook sre.hosts.reimage for host ganeti2009.codfw.wmnet with OS buster [production]
08:29 <marostegui@cumin1001> dbctl commit (dc=all): 'Depooling db1161 (T277354)', diff saved to https://phabricator.wikimedia.org/P18015 and previous config saved to /var/cache/conftool/dbconfig/20211203-082859-marostegui.json [production]
08:28 <marostegui@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db[1154,1161].eqiad.wmnet with reason: Maintenance T277354 [production]
08:28 <marostegui@cumin1001> START - Cookbook sre.hosts.downtime for 1:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db[1154,1161].eqiad.wmnet with reason: Maintenance T277354 [production]
08:28 <marostegui@cumin1001> dbctl commit (dc=all): 'After maintenance db1110 (T277354)', diff saved to https://phabricator.wikimedia.org/P18014 and previous config saved to /var/cache/conftool/dbconfig/20211203-082848-marostegui.json [production]