2026-05-08
20:32 <vriley@cumin1003> START - Cookbook sre.hosts.provision for host db1265.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED [production]
20:31 <vriley@cumin1003> END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1265 [production]
20:30 <vriley@cumin1003> START - Cookbook sre.network.configure-switch-interfaces for host db1265 [production]
20:29 <vriley@cumin1003> END (PASS) - Cookbook sre.dns.netbox (exit_code=0) [production]
20:29 <vriley@cumin1003> END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1265] - vriley@cumin1003" [production]
20:29 <vriley@cumin1003> START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1265] - vriley@cumin1003" [production]
20:24 <vriley@cumin1003> START - Cookbook sre.dns.netbox [production]
20:01 <ryankemper> [WDQS] Added several more requestctl rules. They've helped marginally, but not enough to restore the service. Unless we find an obvious smoking gun, expect noise to continue for the time being :/ [production]
19:42 <jelto@deploy1003> helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply [production]
19:41 <jelto@deploy1003> helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply [production]
19:41 <jelto@deploy1003> helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply [production]
19:40 <jelto@deploy1003> helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply [production]
18:07 <ryankemper> [WDQS] After those 2 requestctl rules, requests went down 20%, error rate decreased significantly, p50 cut almost in half, but the service is still unstable, likely we'll need to identify more throttle-candidates to restore full health [production]
17:53 <ryankemper> [WDQS] Deployed 2 new requestctl rules; we'll see if it helps [production]
16:51 <topranks> enable bfd on system0.0 sub-interface ssw1-d1-eqiad [production]
15:45 <jynus@cumin1003> DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on ms-backup1003.eqiad.wmnet with reason: restart [production]
15:37 <jynus@cumin1003> DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on backup[1006,1017-1018].eqiad.wmnet with reason: restart [production]
14:53 <jmm@cumin2002> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti-jumbo1001.eqiad.wmnet [production]
14:47 <jmm@cumin2002> START - Cookbook sre.hosts.reboot-single for host ganeti-jumbo1001.eqiad.wmnet [production]
14:07 <fceratto@deploy1003> helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' . [production]
10:51 <btullis> re-pooled wdqs-main in eqiad for T425758 [production]
10:50 <btullis@cumin1003> conftool action : set/pooled=true; selector: dnsdisc=wdqs-main,name=eqiad [production]
10:15 <jelto@deploy1003> helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply [production]
10:15 <jelto@deploy1003> helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply [production]
10:15 <jelto@deploy1003> helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply [production]
10:15 <jelto@deploy1003> helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply [production]
10:14 <jynus@cumin1003> DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on backup1007.eqiad.wmnet with reason: restart [production]
10:12 <jelto@deploy1003> helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply [production]
10:12 <jelto@deploy1003> helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply [production]
10:11 <jelto@deploy1003> helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply [production]
10:11 <jelto@deploy1003> helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply [production]
10:09 <jelto@deploy1003> helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply [production]
10:09 <jelto@deploy1003> helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply [production]
09:44 <btullis> depooled wdqs-main in eqiad for T425758 [production]
09:41 <jelto@deploy1003> helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply [production]
09:41 <jelto@deploy1003> helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply [production]
09:41 <jelto@deploy1003> helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply [production]
09:41 <jelto@deploy1003> helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply [production]
09:40 <jelto@deploy1003> helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply [production]
09:40 <jelto@deploy1003> helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply [production]
09:40 <btullis@cumin1003> conftool action : set/pooled=false; selector: dnsdisc=wdqs-main,name=eqiad [production]
09:36 <jelto@deploy1003> helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply [production]
09:36 <jelto@deploy1003> helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply [production]
09:36 <jelto@deploy1003> helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply [production]
09:35 <jelto@deploy1003> helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply [production]
09:32 <fceratto@cumin1003> dbctl commit (dc=all): 'Repooling after maintenance db1189 (T419635)', diff saved to https://phabricator.wikimedia.org/P92437 and previous config saved to /var/cache/conftool/dbconfig/20260508-093251-fceratto.json [production]
09:22 <fceratto@cumin1003> dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P92435 and previous config saved to /var/cache/conftool/dbconfig/20260508-092243-fceratto.json [production]
09:12 <fceratto@cumin1003> dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P92434 and previous config saved to /var/cache/conftool/dbconfig/20260508-091238-fceratto.json [production]
09:02 <fceratto@cumin1003> dbctl commit (dc=all): 'Repooling after maintenance db1189 (T419635)', diff saved to https://phabricator.wikimedia.org/P92433 and previous config saved to /var/cache/conftool/dbconfig/20260508-090230-fceratto.json [production]
08:52 <fceratto@cumin1003> dbctl commit (dc=all): 'Depooling db1189 (T419635)', diff saved to https://phabricator.wikimedia.org/P92432 and previous config saved to /var/cache/conftool/dbconfig/20260508-085217-fceratto.json [production]