2025-08-05
21:37 <jclark@cumin1002> START - Cookbook sre.hosts.reimage for host dbprov1007.eqiad.wmnet with OS bookworm [production]
21:31 <dani@deploy1003> helmfile [codfw] DONE helmfile.d/services/miscweb: apply [production]
21:31 <dani@deploy1003> helmfile [codfw] START helmfile.d/services/miscweb: apply [production]
21:31 <dani@deploy1003> helmfile [eqiad] DONE helmfile.d/services/miscweb: apply [production]
21:31 <dani@deploy1003> helmfile [eqiad] START helmfile.d/services/miscweb: apply [production]
21:31 <dani@deploy1003> helmfile [staging] DONE helmfile.d/services/miscweb: apply [production]
21:31 <dani@deploy1003> helmfile [staging] START helmfile.d/services/miscweb: apply [production]
21:31 <jclark@cumin1002> END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dbprov1007.eqiad.wmnet with OS bookworm [production]
21:27 <fceratto@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db1203', diff saved to https://phabricator.wikimedia.org/P80856 and previous config saved to /var/cache/conftool/dbconfig/20250805-212701-fceratto.json [production]
21:25 <brett@dns1004> END - running authdns-update [production]
21:25 <dani@deploy1003> helmfile [codfw] DONE helmfile.d/services/miscweb: apply [production]
21:25 <dani@deploy1003> helmfile [codfw] START helmfile.d/services/miscweb: apply [production]
21:24 <dani@deploy1003> helmfile [eqiad] DONE helmfile.d/services/miscweb: apply [production]
21:24 <dani@deploy1003> helmfile [eqiad] START helmfile.d/services/miscweb: apply [production]
21:24 <dani@deploy1003> helmfile [staging] DONE helmfile.d/services/miscweb: apply [production]
21:24 <dani@deploy1003> helmfile [staging] START helmfile.d/services/miscweb: apply [production]
21:23 <bking@cumin2002> conftool action : set/pooled=false; selector: dnsdisc=wdqs-scholarly,name=eqiad [production]
21:22 <brett@dns1004> START - running authdns-update [production]
21:18 <bking@cumin2002> START - Cookbook sre.wdqs.data-transfer (T386098, transfer newly-reloaded data) xfer scholarly_articles from wdqs1023.eqiad.wmnet -> wdqs1024.eqiad.wmnet w/ force delete existing files, repooling both afterwards [production]
21:17 <bking@cumin2002> END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) (T386098, transfer newly-reloaded data) xfer scholarly_articles from wdqs1023.eqiad.wmnet -> wdqs1024.eqiad.wmnet w/ force delete existing files, repooling both afterwards [production]
21:17 <bking@cumin2002> START - Cookbook sre.wdqs.data-transfer (T386098, transfer newly-reloaded data) xfer scholarly_articles from wdqs1023.eqiad.wmnet -> wdqs1024.eqiad.wmnet w/ force delete existing files, repooling both afterwards [production]
21:14 <ryankemper@cumin2002> END (ERROR) - Cookbook sre.wdqs.data-reload (exit_code=97) reloading scholarly_articles on wdqs1024.eqiad.wmnet from DumpsSource.HDFS (hdfs:///wmf/data/discovery/wikidata/munged_n3_dump/wikidata/scholarly/20250714/ using stat1009.eqiad.wmnet) [production]
21:11 <fceratto@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db1203 (T399728)', diff saved to https://phabricator.wikimedia.org/P80855 and previous config saved to /var/cache/conftool/dbconfig/20250805-211153-fceratto.json [production]
21:07 <jclark@cumin1002> START - Cookbook sre.hosts.reimage for host dbprov1007.eqiad.wmnet with OS bookworm [production]
21:06 <fceratto@cumin1002> dbctl commit (dc=all): 'Depooling db1203 (T399728)', diff saved to https://phabricator.wikimedia.org/P80854 and previous config saved to /var/cache/conftool/dbconfig/20250805-210649-fceratto.json [production]
21:06 <fceratto@cumin1002> DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1203.eqiad.wmnet with reason: Maintenance [production]
21:06 <fceratto@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db1192 (T399728)', diff saved to https://phabricator.wikimedia.org/P80853 and previous config saved to /var/cache/conftool/dbconfig/20250805-210627-fceratto.json [production]
20:51 <fceratto@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db1192', diff saved to https://phabricator.wikimedia.org/P80852 and previous config saved to /var/cache/conftool/dbconfig/20250805-205119-fceratto.json [production]
20:46 <bking@cumin2002> START - Cookbook sre.wdqs.data-transfer (T386098, transfer newly-reloaded data) xfer wikidata_main from wdqs2007.codfw.wmnet -> wdqs2010.codfw.wmnet w/ force delete existing files, repooling both afterwards [production]
20:40 <jhancock@cumin1003> END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dbprov2007.codfw.wmnet with OS bookworm [production]
20:36 <fceratto@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db1192', diff saved to https://phabricator.wikimedia.org/P80851 and previous config saved to /var/cache/conftool/dbconfig/20250805-203612-fceratto.json [production]
20:35 <ebernhardson> starting cluster mutation test on relforge*.eqiad.wmnet servers [production]
20:21 <fceratto@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db1192 (T399728)', diff saved to https://phabricator.wikimedia.org/P80850 and previous config saved to /var/cache/conftool/dbconfig/20250805-202104-fceratto.json [production]
20:20 <bking@cumin2002> END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) (T386098, transfer newly-reloaded data) xfer wikidata_main from wdqs2007.codfw.wmnet -> wdqs2008.codfw.wmnet w/ force delete existing files, repooling both afterwards [production]
20:19 <jclark@cumin1002> END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dbprov1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED [production]
20:16 <fceratto@cumin1002> dbctl commit (dc=all): 'Depooling db1192 (T399728)', diff saved to https://phabricator.wikimedia.org/P80849 and previous config saved to /var/cache/conftool/dbconfig/20250805-201601-fceratto.json [production]
20:15 <fceratto@cumin1002> DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1192.eqiad.wmnet with reason: Maintenance [production]
20:15 <fceratto@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db1178 (T399728)', diff saved to https://phabricator.wikimedia.org/P80848 and previous config saved to /var/cache/conftool/dbconfig/20250805-201539-fceratto.json [production]
20:00 <fceratto@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db1178', diff saved to https://phabricator.wikimedia.org/P80847 and previous config saved to /var/cache/conftool/dbconfig/20250805-200031-fceratto.json [production]
19:49 <jclark@cumin1002> START - Cookbook sre.hosts.provision for host dbprov1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED [production]
19:49 <mutante> [gitlab2002:~] $ sudo systemctl start wmf_auto_restart_ssh-gitlab T401191 [production]
19:47 <jclark@cumin1002> END (PASS) - Cookbook sre.dns.netbox (exit_code=0) [production]
19:47 <jclark@cumin1002> END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update dns for dbprov1007 - jclark@cumin1002" [production]
19:47 <jclark@cumin1002> START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update dns for dbprov1007 - jclark@cumin1002" [production]
19:45 <fceratto@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db1178', diff saved to https://phabricator.wikimedia.org/P80846 and previous config saved to /var/cache/conftool/dbconfig/20250805-194524-fceratto.json [production]
19:39 <jclark@cumin1002> START - Cookbook sre.dns.netbox [production]
19:30 <fceratto@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db1178 (T399728)', diff saved to https://phabricator.wikimedia.org/P80845 and previous config saved to /var/cache/conftool/dbconfig/20250805-193016-fceratto.json [production]
19:24 <fceratto@cumin1002> dbctl commit (dc=all): 'Depooling db1178 (T399728)', diff saved to https://phabricator.wikimedia.org/P80844 and previous config saved to /var/cache/conftool/dbconfig/20250805-192410-fceratto.json [production]
19:24 <fceratto@cumin1002> DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1178.eqiad.wmnet with reason: Maintenance [production]
19:23 <fceratto@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db1177 (T399728)', diff saved to https://phabricator.wikimedia.org/P80843 and previous config saved to /var/cache/conftool/dbconfig/20250805-192347-fceratto.json [production]