production SAL

2023-06-05
06:20	<_joe_>	killing a pod with consistently high haproxy queue for thumbor in codfw	[production]
06:16	<ayounsi@cumin1001>	END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 60427	[production]
06:15	<ayounsi@cumin1001>	START - Cookbook sre.network.peering with action 'configure' for AS: 60427	[production]
2023-06-03
13:41	<elukey@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on an-test-worker1001.eqiad.wmnet with reason: Host under testing/upgrade	[production]
13:41	<elukey@cumin1001>	START - Cookbook sre.hosts.downtime for 30 days, 0:00:00 on an-test-worker1001.eqiad.wmnet with reason: Host under testing/upgrade	[production]
13:28	<bking@cumin1001>	END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for wdqs2012.codfw.wmnet	[production]
13:28	<bking@cumin1001>	START - Cookbook sre.hosts.remove-downtime for wdqs2012.codfw.wmnet	[production]
2023-06-02
20:16	<apergos>	rsync in ariel screen session, bwlimit 100000, running on dumpsdata1003, pulling from dumpsdata1002, copying over 'other dumps'	[production]
18:42	<bblack>	dns*: puppets are all re-enabled, ntp restarts are done, etc	[production]
17:48	<pt1979@cumin2002>	END (PASS) - Cookbook sre.dns.netbox (exit_code=0)	[production]
17:48	<pt1979@cumin2002>	END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add management record for ssw1-a1-codfw - pt1979@cumin2002"	[production]
17:47	<pt1979@cumin2002>	START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add management record for ssw1-a1-codfw - pt1979@cumin2002"	[production]
17:45	<pt1979@cumin2002>	START - Cookbook sre.dns.netbox	[production]
17:45	<pt1979@cumin2002>	START - Cookbook sre.network.provision for device ssw1-a1-codfw.mgmt.codfw.wmnet	[production]
17:27	<bblack>	dns*: disabling puppet to control rollout of NTP config fixups	[production]
16:03	<bblack>	dns*: removed faulty authdns[12]001 lines from /etc/hosts via cumin+sed	[production]
15:35	<sukhe>	restart ntp.service on dns1002	[production]
13:26	<otto@deploy1002>	helmfile [eqiad] DONE helmfile.d/admin 'apply'.	[production]
13:26	<otto@deploy1002>	helmfile [eqiad] START helmfile.d/admin 'apply'.	[production]
13:25	<otto@deploy1002>	helmfile [codfw] DONE helmfile.d/admin 'apply'.	[production]
13:25	<otto@deploy1002>	helmfile [codfw] START helmfile.d/admin 'apply'.	[production]
13:25	<ottomata>	deploying flink-operator change to dse-k8s and wikikube to add ingress for health check port - https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/926479	[production]
13:24	<otto@deploy1002>	helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.	[production]
13:24	<otto@deploy1002>	helmfile [staging-eqiad] START helmfile.d/admin 'apply'.	[production]
13:24	<otto@deploy1002>	helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.	[production]
13:24	<otto@deploy1002>	helmfile [staging-codfw] START helmfile.d/admin 'apply'.	[production]
13:22	<otto@deploy1002>	helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.	[production]
13:22	<otto@deploy1002>	helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.	[production]
12:03	<moritzm>	installing at-spi2-core bugfix updates from Bullseye point release	[production]
09:35	<moritzm>	installing texlive-security updates on buster	[production]
09:18	<akosiaris>	update kubernetes-node to 1.23.14-2 on all P:kubernetes::node hosts (88 in total) T337836. Reload systemd for unit changes to take effect	[production]
08:52	<jmm@cumin2002>	END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: cp5016.eqsin.wmnet	[production]
08:52	<jmm@cumin2002>	START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: cp5016.eqsin.wmnet	[production]
08:52	<jmm@cumin2002>	END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: cp5015.eqsin.wmnet	[production]
08:51	<jmm@cumin2002>	START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: cp5015.eqsin.wmnet	[production]
08:51	<jmm@cumin2002>	END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: cp5014.eqsin.wmnet	[production]
08:51	<jmm@cumin2002>	START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: cp5014.eqsin.wmnet	[production]
08:51	<jmm@cumin2002>	END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: cp5013.eqsin.wmnet	[production]
08:51	<jmm@cumin2002>	START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: cp5013.eqsin.wmnet	[production]
08:51	<jmm@cumin2002>	END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 0 hosts:	[production]
08:51	<jmm@cumin2002>	START - Cookbook sre.debmonitor.remove-hosts for 0 hosts:	[production]
08:42	<moritzm>	installing traceroute bugfix updates from Bullseye point release	[production]
07:53	<jmm@cumin2002>	END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast6002.wikimedia.org	[production]
07:47	<jmm@cumin2002>	START - Cookbook sre.hosts.reboot-single for host bast6002.wikimedia.org	[production]
07:42	<jmm@cumin2002>	END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast3006.wikimedia.org	[production]
07:36	<jmm@cumin2002>	START - Cookbook sre.hosts.reboot-single for host bast3006.wikimedia.org	[production]
07:30	<mvernon@cumin2002>	END (PASS) - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies (exit_code=0) rolling restart_daemons on A:eqiad or A:codfw and (A:swift-fe or A:swift-fe-canary or A:swift-fe-codfw or A:swift-fe-eqiad)	[production]
07:28	<jmm@cumin2002>	END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast1003.wikimedia.org	[production]
07:22	<mvernon@cumin2002>	START - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies rolling restart_daemons on A:eqiad or A:codfw and (A:swift-fe or A:swift-fe-canary or A:swift-fe-codfw or A:swift-fe-eqiad)	[production]
07:21	<jmm@cumin2002>	START - Cookbook sre.hosts.reboot-single for host bast1003.wikimedia.org	[production]