2025-08-05
§
|
21:37 |
<jclark@cumin1002> |
START - Cookbook sre.hosts.reimage for host dbprov1007.eqiad.wmnet with OS bookworm |
[production] |
21:31 |
<dani@deploy1003> |
helmfile [codfw] DONE helmfile.d/services/miscweb: apply |
[production] |
21:31 |
<dani@deploy1003> |
helmfile [codfw] START helmfile.d/services/miscweb: apply |
[production] |
21:31 |
<dani@deploy1003> |
helmfile [eqiad] DONE helmfile.d/services/miscweb: apply |
[production] |
21:31 |
<dani@deploy1003> |
helmfile [eqiad] START helmfile.d/services/miscweb: apply |
[production] |
21:31 |
<dani@deploy1003> |
helmfile [staging] DONE helmfile.d/services/miscweb: apply |
[production] |
21:31 |
<dani@deploy1003> |
helmfile [staging] START helmfile.d/services/miscweb: apply |
[production] |
21:31 |
<jclark@cumin1002> |
END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dbprov1007.eqiad.wmnet with OS bookworm |
[production] |
21:27 |
<fceratto@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db1203', diff saved to https://phabricator.wikimedia.org/P80856 and previous config saved to /var/cache/conftool/dbconfig/20250805-212701-fceratto.json |
[production] |
21:25 |
<brett@dns1004> |
END - running authdns-update |
[production] |
21:25 |
<dani@deploy1003> |
helmfile [codfw] DONE helmfile.d/services/miscweb: apply |
[production] |
21:25 |
<dani@deploy1003> |
helmfile [codfw] START helmfile.d/services/miscweb: apply |
[production] |
21:24 |
<dani@deploy1003> |
helmfile [eqiad] DONE helmfile.d/services/miscweb: apply |
[production] |
21:24 |
<dani@deploy1003> |
helmfile [eqiad] START helmfile.d/services/miscweb: apply |
[production] |
21:24 |
<dani@deploy1003> |
helmfile [staging] DONE helmfile.d/services/miscweb: apply |
[production] |
21:24 |
<dani@deploy1003> |
helmfile [staging] START helmfile.d/services/miscweb: apply |
[production] |
21:23 |
<bking@cumin2002> |
conftool action : set/pooled=false; selector: dnsdisc=wdqs-scholarly,name=eqiad |
[production] |
21:22 |
<brett@dns1004> |
START - running authdns-update |
[production] |
21:18 |
<bking@cumin2002> |
START - Cookbook sre.wdqs.data-transfer (T386098, transfer newly-reloaded data) xfer scholarly_articles from wdqs1023.eqiad.wmnet -> wdqs1024.eqiad.wmnet w/ force delete existing files, repooling both afterwards |
[production] |
21:17 |
<bking@cumin2002> |
END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) (T386098, transfer newly-reloaded data) xfer scholarly_articles from wdqs1023.eqiad.wmnet -> wdqs1024.eqiad.wmnet w/ force delete existing files, repooling both afterwards |
[production] |
21:17 |
<bking@cumin2002> |
START - Cookbook sre.wdqs.data-transfer (T386098, transfer newly-reloaded data) xfer scholarly_articles from wdqs1023.eqiad.wmnet -> wdqs1024.eqiad.wmnet w/ force delete existing files, repooling both afterwards |
[production] |
21:14 |
<ryankemper@cumin2002> |
END (ERROR) - Cookbook sre.wdqs.data-reload (exit_code=97) reloading scholarly_articles on wdqs1024.eqiad.wmnet from DumpsSource.HDFS (hdfs:///wmf/data/discovery/wikidata/munged_n3_dump/wikidata/scholarly/20250714/ using stat1009.eqiad.wmnet) |
[production] |
21:11 |
<fceratto@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db1203 (T399728)', diff saved to https://phabricator.wikimedia.org/P80855 and previous config saved to /var/cache/conftool/dbconfig/20250805-211153-fceratto.json |
[production] |
21:07 |
<jclark@cumin1002> |
START - Cookbook sre.hosts.reimage for host dbprov1007.eqiad.wmnet with OS bookworm |
[production] |
21:06 |
<fceratto@cumin1002> |
dbctl commit (dc=all): 'Depooling db1203 (T399728)', diff saved to https://phabricator.wikimedia.org/P80854 and previous config saved to /var/cache/conftool/dbconfig/20250805-210649-fceratto.json |
[production] |
21:06 |
<fceratto@cumin1002> |
DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1203.eqiad.wmnet with reason: Maintenance |
[production] |
21:06 |
<fceratto@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db1192 (T399728)', diff saved to https://phabricator.wikimedia.org/P80853 and previous config saved to /var/cache/conftool/dbconfig/20250805-210627-fceratto.json |
[production] |
20:51 |
<fceratto@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db1192', diff saved to https://phabricator.wikimedia.org/P80852 and previous config saved to /var/cache/conftool/dbconfig/20250805-205119-fceratto.json |
[production] |
20:46 |
<bking@cumin2002> |
START - Cookbook sre.wdqs.data-transfer (T386098, transfer newly-reloaded data) xfer wikidata_main from wdqs2007.codfw.wmnet -> wdqs2010.codfw.wmnet w/ force delete existing files, repooling both afterwards |
[production] |
20:40 |
<jhancock@cumin1003> |
END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dbprov2007.codfw.wmnet with OS bookworm |
[production] |
20:36 |
<fceratto@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db1192', diff saved to https://phabricator.wikimedia.org/P80851 and previous config saved to /var/cache/conftool/dbconfig/20250805-203612-fceratto.json |
[production] |
20:35 |
<ebernhardson> |
starting cluster mutation test on relforge*.eqiad.wmnet servers |
[production] |
20:21 |
<fceratto@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db1192 (T399728)', diff saved to https://phabricator.wikimedia.org/P80850 and previous config saved to /var/cache/conftool/dbconfig/20250805-202104-fceratto.json |
[production] |
20:20 |
<bking@cumin2002> |
END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) (T386098, transfer newly-reloaded data) xfer wikidata_main from wdqs2007.codfw.wmnet -> wdqs2008.codfw.wmnet w/ force delete existing files, repooling both afterwards |
[production] |
20:19 |
<jclark@cumin1002> |
END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dbprov1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED |
[production] |
20:16 |
<fceratto@cumin1002> |
dbctl commit (dc=all): 'Depooling db1192 (T399728)', diff saved to https://phabricator.wikimedia.org/P80849 and previous config saved to /var/cache/conftool/dbconfig/20250805-201601-fceratto.json |
[production] |
20:15 |
<fceratto@cumin1002> |
DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1192.eqiad.wmnet with reason: Maintenance |
[production] |
20:15 |
<fceratto@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db1178 (T399728)', diff saved to https://phabricator.wikimedia.org/P80848 and previous config saved to /var/cache/conftool/dbconfig/20250805-201539-fceratto.json |
[production] |
20:00 |
<fceratto@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db1178', diff saved to https://phabricator.wikimedia.org/P80847 and previous config saved to /var/cache/conftool/dbconfig/20250805-200031-fceratto.json |
[production] |
19:49 |
<jclark@cumin1002> |
START - Cookbook sre.hosts.provision for host dbprov1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED |
[production] |
19:49 |
<mutante> |
[gitlab2002:~] $ sudo systemctl start wmf_auto_restart_ssh-gitlab T401191 |
[production] |
19:47 |
<jclark@cumin1002> |
END (PASS) - Cookbook sre.dns.netbox (exit_code=0) |
[production] |
19:47 |
<jclark@cumin1002> |
END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update dns for dbprov1007 - jclark@cumin1002" |
[production] |
19:47 |
<jclark@cumin1002> |
START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update dns for dbprov1007 - jclark@cumin1002" |
[production] |
19:45 |
<fceratto@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db1178', diff saved to https://phabricator.wikimedia.org/P80846 and previous config saved to /var/cache/conftool/dbconfig/20250805-194524-fceratto.json |
[production] |
19:39 |
<jclark@cumin1002> |
START - Cookbook sre.dns.netbox |
[production] |
19:30 |
<fceratto@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db1178 (T399728)', diff saved to https://phabricator.wikimedia.org/P80845 and previous config saved to /var/cache/conftool/dbconfig/20250805-193016-fceratto.json |
[production] |
19:24 |
<fceratto@cumin1002> |
dbctl commit (dc=all): 'Depooling db1178 (T399728)', diff saved to https://phabricator.wikimedia.org/P80844 and previous config saved to /var/cache/conftool/dbconfig/20250805-192410-fceratto.json |
[production] |
19:24 |
<fceratto@cumin1002> |
DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1178.eqiad.wmnet with reason: Maintenance |
[production] |
19:23 |
<fceratto@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db1177 (T399728)', diff saved to https://phabricator.wikimedia.org/P80843 and previous config saved to /var/cache/conftool/dbconfig/20250805-192347-fceratto.json |
[production] |