2022-10-03
19:37 <ryankemper> [Elastic] Restarted psi on `elastic1066`; will unban host after process is up and running [production]
19:32 <robh> msw1-ulsfo swap successful, mgmt recovering in icinga and tested connection with 3 servers all work [production]
19:25 <robh> msw1-ulsfo swap, some mgmt flapping expected, swap complete but not powered back up yet [production]
19:22 <ryankemper> [Elastic] Banned `elastic1066` (`curl -H 'Content-Type: application/json' -XPUT http://localhost:9600/_cluster/settings -d '{"transient":{"cluster.routing.allocation.exclude":{"_host": "","_name": "elastic1066-production-search-psi-eqiad"}}}'`); will restart elasticsearch-psi after shards drain [production]
19:15 <robh@cumin2002> END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dns4003.wikimedia.org with OS bullseye [production]
18:48 <robh@cumin2002> START - Cookbook sre.hosts.reimage for host dns4003.wikimedia.org with OS bullseye [production]
18:41 <robh@cumin2002> END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dns4003.wikimedia.org with OS bullseye [production]
18:34 <robh@cumin2002> START - Cookbook sre.hosts.reimage for host dns4003.wikimedia.org with OS bullseye [production]
18:30 <robh@cumin2002> END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dns4003.mgmt.ulsfo.wmnet with reboot policy FORCED [production]
18:30 <bblack@cumin1001> END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp4045.ulsfo.wmnet with OS buster [production]
18:21 <robh@cumin2002> START - Cookbook sre.hosts.provision for host dns4003.mgmt.ulsfo.wmnet with reboot policy FORCED [production]
18:12 <robh@cumin2002> END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dns4003.mgmt.ulsfo.wmnet with reboot policy FORCED [production]
18:06 <robh@cumin2002> START - Cookbook sre.hosts.provision for host dns4003.mgmt.ulsfo.wmnet with reboot policy FORCED [production]
18:04 <robh@cumin2002> END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dns4003.mgmt.ulsfo.wmnet with reboot policy FORCED [production]
18:00 <robh@cumin2002> START - Cookbook sre.hosts.provision for host dns4003.mgmt.ulsfo.wmnet with reboot policy FORCED [production]
17:52 <robh@cumin2002> END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dns4003.mgmt.ulsfo.wmnet with reboot policy FORCED [production]
17:42 <robh@cumin2002> START - Cookbook sre.hosts.provision for host dns4003.mgmt.ulsfo.wmnet with reboot policy FORCED [production]
17:41 <robh@cumin2002> END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dns4003 [production]
17:41 <robh@cumin2002> START - Cookbook sre.network.configure-switch-interfaces for host dns4003 [production]
17:40 <robh@cumin2002> END (PASS) - Cookbook sre.dns.netbox (exit_code=0) [production]
17:37 <robh@cumin2002> START - Cookbook sre.dns.netbox [production]
17:29 <bblack@cumin1001> START - Cookbook sre.hosts.reimage for host cp4045.ulsfo.wmnet with OS buster [production]
17:29 <sukhe> running homer "cr*-ulsfo*" commit "Gerrit 837727: remove dns4001 for anycast neighbors." [production]
17:13 <robh@cumin2002> END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts dns4001.wikimedia.org [production]
17:13 <robh@cumin2002> END (PASS) - Cookbook sre.dns.netbox (exit_code=0) [production]
17:08 <robh@cumin2002> START - Cookbook sre.dns.netbox [production]
17:04 <robh@cumin2002> START - Cookbook sre.hosts.decommission for hosts dns4001.wikimedia.org [production]
16:43 <mwdebug-deploy@deploy1002> helmfile [codfw] DONE helmfile.d/services/mwdebug: apply [production]
16:39 <mwdebug-deploy@deploy1002> helmfile [codfw] START helmfile.d/services/mwdebug: apply [production]
16:39 <mwdebug-deploy@deploy1002> helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply [production]
16:34 <mwdebug-deploy@deploy1002> helmfile [eqiad] START helmfile.d/services/mwdebug: apply [production]
16:33 <ayounsi@cumin1001> END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 30781 [production]
16:33 <ayounsi@cumin1001> START - Cookbook sre.network.peering with action 'configure' for AS: 30781 [production]
16:29 <mwdebug-deploy@deploy1002> helmfile [codfw] DONE helmfile.d/services/mwdebug: apply [production]
16:28 <mwdebug-deploy@deploy1002> helmfile [codfw] START helmfile.d/services/mwdebug: apply [production]
16:28 <mwdebug-deploy@deploy1002> helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply [production]
16:27 <mwdebug-deploy@deploy1002> helmfile [eqiad] START helmfile.d/services/mwdebug: apply [production]
16:24 <urbanecm@deploy1002> Finished scap: Backport for [[gerrit:837696|throttle: Remove out of date rules]] (duration: 04m 16s) [production]
16:22 <mwdebug-deploy@deploy1002> helmfile [codfw] DONE helmfile.d/services/mwdebug: apply [production]
16:21 <mwdebug-deploy@deploy1002> helmfile [codfw] START helmfile.d/services/mwdebug: apply [production]
16:21 <mwdebug-deploy@deploy1002> helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply [production]
16:20 <mwdebug-deploy@deploy1002> helmfile [eqiad] START helmfile.d/services/mwdebug: apply [production]
16:20 <urbanecm@deploy1002> urbanecm and urbanecm: Backport for [[gerrit:837696|throttle: Remove out of date rules]] synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet [production]
16:20 <urbanecm@deploy1002> Started scap: Backport for [[gerrit:837696|throttle: Remove out of date rules]] [production]
16:18 <urbanecm@deploy1002> Synchronized wmf-config/InitialiseSettings.php: cae49b85d2d780e34b553789d56d76bac4a62c48: throttle: Add throttle rule for 2022-10-06 (T319212) (duration: 04m 21s) [production]
16:14 <sukhe> disable Puppet on cp hosts in codfw: rolling out T309651 [production]
15:15 <sukhe> disable Puppet on cp hosts in ulsfo: rolling out T309651 [production]
15:14 <marostegui@cumin1001> dbctl commit (dc=all): 'db2123 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35320 and previous config saved to /var/cache/conftool/dbconfig/20221003-151438-root.json [production]
15:06 <papaul> maintenance complete on mr1-esams [production]
14:59 <marostegui@cumin1001> dbctl commit (dc=all): 'db2123 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35319 and previous config saved to /var/cache/conftool/dbconfig/20221003-145933-root.json [production]