2022-01-27
|
10:00 |
<marostegui@cumin1001> |
START - Cookbook sre.hosts.downtime for 6:00:00 on db1174.eqiad.wmnet with reason: Maintenance |
[production] |
10:00 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 (T298559)', diff saved to https://phabricator.wikimedia.org/P19416 and previous config saved to /var/cache/conftool/dbconfig/20220127-100007-marostegui.json |
[production] |
10:00 |
<marostegui> |
Failover m1 from db1159 to db1128 - T299624 |
[production] |
09:57 |
<jynus> |
Stopped Bacula Director Daemon service at backup1001 T299624 |
[production] |
09:53 |
<jmm@cumin2002> |
END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti1027.eqiad.wmnet to ganeti01.svc.eqiad.wmnet |
[production] |
09:53 |
<moritzm> |
added ganeti1027 to Ganeti eqiad cluster T293909 |
[production] |
09:52 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1131 (re)pooling @ 75%: repooling after reimage', diff saved to https://phabricator.wikimedia.org/P19415 and previous config saved to /var/cache/conftool/dbconfig/20220127-095258-root.json |
[production] |
09:51 |
<jmm@cumin2002> |
START - Cookbook sre.ganeti.addnode for new host ganeti1027.eqiad.wmnet to ganeti01.svc.eqiad.wmnet |
[production] |
09:50 |
<hnowlan@deploy1002> |
Finished deploy [restbase/deploy@0848b15] (dev-cluster): (no justification provided) (duration: 00m 14s) |
[production] |
09:50 |
<hnowlan@deploy1002> |
Started deploy [restbase/deploy@0848b15] (dev-cluster): (no justification provided) |
[production] |
09:47 |
<jmm@cumin2002> |
END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1027.eqiad.wmnet |
[production] |
09:46 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1112', diff saved to https://phabricator.wikimedia.org/P19414 and previous config saved to /var/cache/conftool/dbconfig/20220127-094651-marostegui.json |
[production] |
09:45 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1170:3317', diff saved to https://phabricator.wikimedia.org/P19413 and previous config saved to /var/cache/conftool/dbconfig/20220127-094502-marostegui.json |
[production] |
09:41 |
<jmm@cumin2002> |
START - Cookbook sre.hosts.reboot-single for host ganeti1027.eqiad.wmnet |
[production] |
09:37 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1131 (re)pooling @ 60%: repooling after reimage', diff saved to https://phabricator.wikimedia.org/P19412 and previous config saved to /var/cache/conftool/dbconfig/20220127-093755-root.json |
[production] |
09:31 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1112', diff saved to https://phabricator.wikimedia.org/P19411 and previous config saved to /var/cache/conftool/dbconfig/20220127-093146-marostegui.json |
[production] |
09:29 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1170:3317', diff saved to https://phabricator.wikimedia.org/P19410 and previous config saved to /var/cache/conftool/dbconfig/20220127-092957-marostegui.json |
[production] |
09:27 |
<filippo@puppetmaster1001> |
conftool action : set/weight=10; selector: name=prometheus2005.codfw.wmnet |
[production] |
09:27 |
<filippo@puppetmaster1001> |
conftool action : set/weight=10; selector: name=prometheus2006.codfw.wmnet |
[production] |
09:23 |
<root@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db[2078,2132].codfw.wmnet,db[1117,1128,1159].eqiad.wmnet with reason: Primary switchover m1 T299624 |
[production] |
09:23 |
<root@cumin1001> |
START - Cookbook sre.hosts.downtime for 1:00:00 on db[2078,2132].codfw.wmnet,db[1117,1128,1159].eqiad.wmnet with reason: Primary switchover m1 T299624 |
[production] |
09:22 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1131 (re)pooling @ 50%: repooling after reimage', diff saved to https://phabricator.wikimedia.org/P19409 and previous config saved to /var/cache/conftool/dbconfig/20220127-092251-root.json |
[production] |
09:18 |
<jmm@cumin2002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on ganeti1007.eqiad.wmnet with reason: Remove from Ganeti cluster for reimage |
[production] |
09:18 |
<jmm@cumin2002> |
START - Cookbook sre.hosts.downtime for 4 days, 0:00:00 on ganeti1007.eqiad.wmnet with reason: Remove from Ganeti cluster for reimage |
[production] |
09:16 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1112 (T285149)', diff saved to https://phabricator.wikimedia.org/P19408 and previous config saved to /var/cache/conftool/dbconfig/20220127-091641-marostegui.json |
[production] |
09:14 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 (T298559)', diff saved to https://phabricator.wikimedia.org/P19407 and previous config saved to /var/cache/conftool/dbconfig/20220127-091453-marostegui.json |
[production] |
09:14 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Depooling db1170:3317 (T298559)', diff saved to https://phabricator.wikimedia.org/P19406 and previous config saved to /var/cache/conftool/dbconfig/20220127-091440-marostegui.json |
[production] |
09:14 |
<marostegui@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance |
[production] |
09:14 |
<marostegui@cumin1001> |
START - Cookbook sre.hosts.downtime for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance |
[production] |
09:14 |
<marostegui@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 10 hosts with reason: Maintenance |
[production] |
09:14 |
<marostegui@cumin1001> |
START - Cookbook sre.hosts.downtime for 12:00:00 on 10 hosts with reason: Maintenance |
[production] |
09:14 |
<marostegui@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2121.codfw.wmnet with reason: Maintenance |
[production] |
09:14 |
<marostegui@cumin1001> |
START - Cookbook sre.hosts.downtime for 6:00:00 on db2121.codfw.wmnet with reason: Maintenance |
[production] |
09:14 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1127 (T298559)', diff saved to https://phabricator.wikimedia.org/P19405 and previous config saved to /var/cache/conftool/dbconfig/20220127-091401-marostegui.json |
[production] |
09:07 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1131 (re)pooling @ 40%: repooling after reimage', diff saved to https://phabricator.wikimedia.org/P19404 and previous config saved to /var/cache/conftool/dbconfig/20220127-090747-root.json |
[production] |
08:58 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1127', diff saved to https://phabricator.wikimedia.org/P19403 and previous config saved to /var/cache/conftool/dbconfig/20220127-085857-marostegui.json |
[production] |
08:52 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1131 (re)pooling @ 25%: repooling after reimage', diff saved to https://phabricator.wikimedia.org/P19402 and previous config saved to /var/cache/conftool/dbconfig/20220127-085244-root.json |
[production] |
08:43 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1127', diff saved to https://phabricator.wikimedia.org/P19401 and previous config saved to /var/cache/conftool/dbconfig/20220127-084352-marostegui.json |
[production] |
08:41 |
<jayme@deploy1002> |
Finished deploy [restbase/deploy@0848b15]: scap testing (duration: 00m 05s) |
[production] |
08:40 |
<jayme@deploy1002> |
Started deploy [restbase/deploy@0848b15]: scap testing |
[production] |
08:38 |
<jayme> |
updated scap to 4.2.1 on A:mw-canary, A:parsoid-canary, A:mw-jobrunner-canary, A:restbase-canary - T300058 |
[production] |
08:37 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1131 (re)pooling @ 20%: repooling after reimage', diff saved to https://phabricator.wikimedia.org/P19400 and previous config saved to /var/cache/conftool/dbconfig/20220127-083740-root.json |
[production] |
08:33 |
<jayme> |
uploaded scap 4.2.1 to apt.wikimedia.org - T300058 |
[production] |
08:28 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1127 (T298559)', diff saved to https://phabricator.wikimedia.org/P19399 and previous config saved to /var/cache/conftool/dbconfig/20220127-082847-marostegui.json |
[production] |
08:27 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Depooling db1127 (T298559)', diff saved to https://phabricator.wikimedia.org/P19398 and previous config saved to /var/cache/conftool/dbconfig/20220127-082735-marostegui.json |
[production] |
08:27 |
<marostegui@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1127.eqiad.wmnet with reason: Maintenance |
[production] |
08:27 |
<marostegui@cumin1001> |
START - Cookbook sre.hosts.downtime for 6:00:00 on db1127.eqiad.wmnet with reason: Maintenance |
[production] |
08:27 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1101:3317 (T298559)', diff saved to https://phabricator.wikimedia.org/P19397 and previous config saved to /var/cache/conftool/dbconfig/20220127-082728-marostegui.json |
[production] |
08:22 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1131 (re)pooling @ 10%: repooling after reimage', diff saved to https://phabricator.wikimedia.org/P19396 and previous config saved to /var/cache/conftool/dbconfig/20220127-082236-root.json |
[production] |
08:21 |
<jayme@deploy1002> |
helmfile [staging-codfw] DONE helmfile.d/admin 'apply'. |
[production] |