2022-01-27
|
10:47 |
<ladsgroup@cumin1001> |
START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on es[2024-2025].codfw.wmnet with reason: Reimage of the master T300006 |
[production] |
10:46 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Depooling db1158 (T298559)', diff saved to https://phabricator.wikimedia.org/P19426 and previous config saved to /var/cache/conftool/dbconfig/20220127-104654-marostegui.json |
[production] |
10:46 |
<marostegui@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance |
[production] |
10:46 |
<marostegui@cumin1001> |
START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance |
[production] |
10:46 |
<marostegui@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1158.eqiad.wmnet with reason: Maintenance |
[production] |
10:46 |
<marostegui@cumin1001> |
START - Cookbook sre.hosts.downtime for 6:00:00 on db1158.eqiad.wmnet with reason: Maintenance |
[production] |
10:46 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1174 (T298559)', diff saved to https://phabricator.wikimedia.org/P19425 and previous config saved to /var/cache/conftool/dbconfig/20220127-104641-marostegui.json |
[production] |
10:46 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1166 (T285149)', diff saved to https://phabricator.wikimedia.org/P19424 and previous config saved to /var/cache/conftool/dbconfig/20220127-104618-marostegui.json |
[production] |
10:38 |
<marostegui@cumin1001> |
START - Cookbook sre.hosts.reimage for host db1159.eqiad.wmnet with OS bullseye |
[production] |
10:35 |
<Amir1> |
creating linktarget table everywhere (T299416) |
[production] |
10:31 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P19423 and previous config saved to /var/cache/conftool/dbconfig/20220127-103136-marostegui.json |
[production] |
10:20 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Depooling db1166 (T285149)', diff saved to https://phabricator.wikimedia.org/P19422 and previous config saved to /var/cache/conftool/dbconfig/20220127-102049-marostegui.json |
[production] |
10:20 |
<marostegui@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1166.eqiad.wmnet with reason: Maintenance |
[production] |
10:20 |
<marostegui@cumin1001> |
START - Cookbook sre.hosts.downtime for 6:00:00 on db1166.eqiad.wmnet with reason: Maintenance |
[production] |
10:17 |
<jynus> |
Started Bacula Director Daemon service at backup1001 T299624 |
[production] |
10:16 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P19421 and previous config saved to /var/cache/conftool/dbconfig/20220127-101631-marostegui.json |
[production] |
10:08 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1131 (re)pooling @ 100%: repooling after reimage', diff saved to https://phabricator.wikimedia.org/P19420 and previous config saved to /var/cache/conftool/dbconfig/20220127-100802-root.json |
[production] |
10:02 |
<marostegui@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 6 hosts with reason: Maintenance |
[production] |
10:02 |
<marostegui@cumin1001> |
START - Cookbook sre.hosts.downtime for 12:00:00 on 6 hosts with reason: Maintenance |
[production] |
10:02 |
<marostegui@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2105.codfw.wmnet with reason: Maintenance |
[production] |
10:02 |
<marostegui@cumin1001> |
START - Cookbook sre.hosts.downtime for 6:00:00 on db2105.codfw.wmnet with reason: Maintenance |
[production] |
10:01 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1112 (T285149)', diff saved to https://phabricator.wikimedia.org/P19419 and previous config saved to /var/cache/conftool/dbconfig/20220127-100155-marostegui.json |
[production] |
10:01 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1174 (T298559)', diff saved to https://phabricator.wikimedia.org/P19418 and previous config saved to /var/cache/conftool/dbconfig/20220127-100127-marostegui.json |
[production] |
10:00 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Depooling db1174 (T298559)', diff saved to https://phabricator.wikimedia.org/P19417 and previous config saved to /var/cache/conftool/dbconfig/20220127-100014-marostegui.json |
[production] |
10:00 |
<marostegui@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1174.eqiad.wmnet with reason: Maintenance |
[production] |
10:00 |
<marostegui@cumin1001> |
START - Cookbook sre.hosts.downtime for 6:00:00 on db1174.eqiad.wmnet with reason: Maintenance |
[production] |
10:00 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 (T298559)', diff saved to https://phabricator.wikimedia.org/P19416 and previous config saved to /var/cache/conftool/dbconfig/20220127-100007-marostegui.json |
[production] |
10:00 |
<marostegui> |
Failover m1 from db1159 to db1128 - T299624 |
[production] |
09:57 |
<jynus> |
Stopped Bacula Director Daemon service at backup1001 T299624 |
[production] |
09:53 |
<jmm@cumin2002> |
END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti1027.eqiad.wmnet to ganeti01.svc.eqiad.wmnet |
[production] |
09:53 |
<moritzm> |
added ganeti1027 to Ganeti eqiad cluster T293909 |
[production] |
09:52 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1131 (re)pooling @ 75%: repooling after reimage', diff saved to https://phabricator.wikimedia.org/P19415 and previous config saved to /var/cache/conftool/dbconfig/20220127-095258-root.json |
[production] |
09:51 |
<jmm@cumin2002> |
START - Cookbook sre.ganeti.addnode for new host ganeti1027.eqiad.wmnet to ganeti01.svc.eqiad.wmnet |
[production] |
09:50 |
<hnowlan@deploy1002> |
Finished deploy [restbase/deploy@0848b15] (dev-cluster): (no justification provided) (duration: 00m 14s) |
[production] |
09:50 |
<hnowlan@deploy1002> |
Started deploy [restbase/deploy@0848b15] (dev-cluster): (no justification provided) |
[production] |
09:47 |
<jmm@cumin2002> |
END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1027.eqiad.wmnet |
[production] |
09:46 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1112', diff saved to https://phabricator.wikimedia.org/P19414 and previous config saved to /var/cache/conftool/dbconfig/20220127-094651-marostegui.json |
[production] |
09:45 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1170:3317', diff saved to https://phabricator.wikimedia.org/P19413 and previous config saved to /var/cache/conftool/dbconfig/20220127-094502-marostegui.json |
[production] |
09:41 |
<jmm@cumin2002> |
START - Cookbook sre.hosts.reboot-single for host ganeti1027.eqiad.wmnet |
[production] |
09:37 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1131 (re)pooling @ 60%: repooling after reimage', diff saved to https://phabricator.wikimedia.org/P19412 and previous config saved to /var/cache/conftool/dbconfig/20220127-093755-root.json |
[production] |
09:31 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1112', diff saved to https://phabricator.wikimedia.org/P19411 and previous config saved to /var/cache/conftool/dbconfig/20220127-093146-marostegui.json |
[production] |
09:29 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1170:3317', diff saved to https://phabricator.wikimedia.org/P19410 and previous config saved to /var/cache/conftool/dbconfig/20220127-092957-marostegui.json |
[production] |
09:27 |
<filippo@puppetmaster1001> |
conftool action : set/weight=10; selector: name=prometheus2005.codfw.wmnet |
[production] |
09:27 |
<filippo@puppetmaster1001> |
conftool action : set/weight=10; selector: name=prometheus2006.codfw.wmnet |
[production] |
09:23 |
<root@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db[2078,2132].codfw.wmnet,db[1117,1128,1159].eqiad.wmnet with reason: Primary switchover m1 T299624 |
[production] |
09:23 |
<root@cumin1001> |
START - Cookbook sre.hosts.downtime for 1:00:00 on db[2078,2132].codfw.wmnet,db[1117,1128,1159].eqiad.wmnet with reason: Primary switchover m1 T299624 |
[production] |
09:22 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1131 (re)pooling @ 50%: repooling after reimage', diff saved to https://phabricator.wikimedia.org/P19409 and previous config saved to /var/cache/conftool/dbconfig/20220127-092251-root.json |
[production] |
09:18 |
<jmm@cumin2002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on ganeti1007.eqiad.wmnet with reason: Remove from Ganeti cluster for reimage |
[production] |
09:18 |
<jmm@cumin2002> |
START - Cookbook sre.hosts.downtime for 4 days, 0:00:00 on ganeti1007.eqiad.wmnet with reason: Remove from Ganeti cluster for reimage |
[production] |
09:16 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1112 (T285149)', diff saved to https://phabricator.wikimedia.org/P19408 and previous config saved to /var/cache/conftool/dbconfig/20220127-091641-marostegui.json |
[production] |