2023-06-05
06:20 <_joe_> killing a pod with consistently high haproxy queue for thumbor in codfw [production]
06:16 <ayounsi@cumin1001> END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 60427 [production]
06:15 <ayounsi@cumin1001> START - Cookbook sre.network.peering with action 'configure' for AS: 60427 [production]
2023-06-03
13:41 <elukey@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on an-test-worker1001.eqiad.wmnet with reason: Host under testing/upgrade [production]
13:41 <elukey@cumin1001> START - Cookbook sre.hosts.downtime for 30 days, 0:00:00 on an-test-worker1001.eqiad.wmnet with reason: Host under testing/upgrade [production]
13:28 <bking@cumin1001> END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for wdqs2012.codfw.wmnet [production]
13:28 <bking@cumin1001> START - Cookbook sre.hosts.remove-downtime for wdqs2012.codfw.wmnet [production]
2023-06-02
20:16 <apergos> rsync in ariel screen session, bwlimit 100000, running on dumpsdata1003, pulling from dumpsdata1002, copying over 'other dumps' [production]
18:42 <bblack> dns*: puppets are all re-enabled, ntp restarts are done, etc [production]
17:48 <pt1979@cumin2002> END (PASS) - Cookbook sre.dns.netbox (exit_code=0) [production]
17:48 <pt1979@cumin2002> END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add management record for ssw1-a1-codfw - pt1979@cumin2002" [production]
17:47 <pt1979@cumin2002> START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add management record for ssw1-a1-codfw - pt1979@cumin2002" [production]
17:45 <pt1979@cumin2002> START - Cookbook sre.dns.netbox [production]
17:45 <pt1979@cumin2002> START - Cookbook sre.network.provision for device ssw1-a1-codfw.mgmt.codfw.wmnet [production]
17:27 <bblack> dns*: disabling puppet to control rollout of NTP config fixups [production]
16:03 <bblack> dns*: removed faulty authdns[12]001 lines from /etc/hosts via cumin+sed [production]
15:35 <sukhe> restart ntp.service on dns1002 [production]
13:26 <otto@deploy1002> helmfile [eqiad] DONE helmfile.d/admin 'apply'. [production]
13:26 <otto@deploy1002> helmfile [eqiad] START helmfile.d/admin 'apply'. [production]
13:25 <otto@deploy1002> helmfile [codfw] DONE helmfile.d/admin 'apply'. [production]
13:25 <otto@deploy1002> helmfile [codfw] START helmfile.d/admin 'apply'. [production]
13:25 <ottomata> deploying flink-operator change to dse-k8s and wikikube to add ingress for health check port - https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/926479 [production]
13:24 <otto@deploy1002> helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'. [production]
13:24 <otto@deploy1002> helmfile [staging-eqiad] START helmfile.d/admin 'apply'. [production]
13:24 <otto@deploy1002> helmfile [staging-codfw] DONE helmfile.d/admin 'apply'. [production]
13:24 <otto@deploy1002> helmfile [staging-codfw] START helmfile.d/admin 'apply'. [production]
13:22 <otto@deploy1002> helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'. [production]
13:22 <otto@deploy1002> helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'. [production]
12:03 <moritzm> installing at-spi2-core bugfix updates from Bullseye point release [production]
09:35 <moritzm> installing texlive-security updates on buster [production]
09:18 <akosiaris> update kubernetes-node to 1.23.14-2 on all P:kubernetes::node hosts (88 in total) T337836. Reload systemd for unit changes to take effect [production]
08:52 <jmm@cumin2002> END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: cp5016.eqsin.wmnet [production]
08:52 <jmm@cumin2002> START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: cp5016.eqsin.wmnet [production]
08:52 <jmm@cumin2002> END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: cp5015.eqsin.wmnet [production]
08:51 <jmm@cumin2002> START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: cp5015.eqsin.wmnet [production]
08:51 <jmm@cumin2002> END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: cp5014.eqsin.wmnet [production]
08:51 <jmm@cumin2002> START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: cp5014.eqsin.wmnet [production]
08:51 <jmm@cumin2002> END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: cp5013.eqsin.wmnet [production]
08:51 <jmm@cumin2002> START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: cp5013.eqsin.wmnet [production]
08:51 <jmm@cumin2002> END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 0 hosts: [production]
08:51 <jmm@cumin2002> START - Cookbook sre.debmonitor.remove-hosts for 0 hosts: [production]
08:42 <moritzm> installing traceroute bugfix updates from Bullseye point release [production]
07:53 <jmm@cumin2002> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast6002.wikimedia.org [production]
07:47 <jmm@cumin2002> START - Cookbook sre.hosts.reboot-single for host bast6002.wikimedia.org [production]
07:42 <jmm@cumin2002> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast3006.wikimedia.org [production]
07:36 <jmm@cumin2002> START - Cookbook sre.hosts.reboot-single for host bast3006.wikimedia.org [production]
07:30 <mvernon@cumin2002> END (PASS) - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies (exit_code=0) rolling restart_daemons on A:eqiad or A:codfw and (A:swift-fe or A:swift-fe-canary or A:swift-fe-codfw or A:swift-fe-eqiad) [production]
07:28 <jmm@cumin2002> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast1003.wikimedia.org [production]
07:22 <mvernon@cumin2002> START - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies rolling restart_daemons on A:eqiad or A:codfw and (A:swift-fe or A:swift-fe-canary or A:swift-fe-codfw or A:swift-fe-eqiad) [production]
07:21 <jmm@cumin2002> START - Cookbook sre.hosts.reboot-single for host bast1003.wikimedia.org [production]