production SAL

2023-05-02
17:25	<sukhe>	ns0 set routing-options static route 208.80.154.238/32 next-hop [ 208.80.154.10 208.80.155.108 208.80.154.134 ]: T330670	[production]
17:03	<jmm@cumin2002>	END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host sretest1001.eqiad.wmnet	[production]
16:58	<jmm@cumin2002>	START - Cookbook sre.hosts.reboot-single for host sretest1001.eqiad.wmnet	[production]
16:38	<btullis@cumin1001>	END (PASS) - Cookbook sre.aqs.roll-restart (exit_code=0) for AQS aqs cluster: Roll restart of all AQS's nodejs daemons.	[production]
16:24	<btullis@cumin1001>	START - Cookbook sre.aqs.roll-restart for AQS aqs cluster: Roll restart of all AQS's nodejs daemons.	[production]
16:16	<ebernhardson@deploy1002>	Finished deploy [search/mjolnir/deploy@bb96aca]: Add snappy dependency for kafka daemons (duration: 00m 26s)	[production]
16:16	<sukhe>	ns0 backup routes: delete routing-options static route 208.80.154.238/32 next-hop 208.80.153.111, set to 208.80.153.77	[production]
16:16	<ebernhardson@deploy1002>	Started deploy [search/mjolnir/deploy@bb96aca]: Add snappy dependency for kafka daemons	[production]
16:12	<sukhe>	ns1: delete routing-options static route 208.80.153.231/32 next-hop 208.80.153.111, set to 208.80.153.77	[production]
16:11	<hnowlan@puppetmaster1001>	conftool action : set/pooled=yes; selector: service=thumbor,name=kubernetes101[0123].eqiad.wmnet	[production]
16:10	<hnowlan@deploy1002>	helmfile [eqiad] DONE helmfile.d/services/thumbor: sync	[production]
16:08	<hnowlan@deploy1002>	helmfile [eqiad] START helmfile.d/services/thumbor: sync	[production]
16:06	<hnowlan@puppetmaster1001>	conftool action : set/pooled=no; selector: service=thumbor,name=kubernetes101[0123].eqiad.wmnet	[production]
15:39	<jclark@cumin1001>	END (PASS) - Cookbook sre.dns.netbox (exit_code=0)	[production]
15:38	<jclark@cumin1001>	START - Cookbook sre.dns.netbox	[production]
15:36	<claime>	Re-running puppet on failed parse servers - T313227	[production]
15:35	<bking@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12 days, 0:00:00 on wdqs2022.codfw.wmnet with reason: attempting WDQS stack on bullseye	[production]
15:35	<jclark@cumin1001>	END (PASS) - Cookbook sre.dns.netbox (exit_code=0)	[production]
15:34	<bking@cumin1001>	START - Cookbook sre.hosts.downtime for 12 days, 0:00:00 on wdqs2022.codfw.wmnet with reason: attempting WDQS stack on bullseye	[production]
15:33	<jclark@cumin1001>	START - Cookbook sre.dns.netbox	[production]
15:16	<jiji@cumin1001>	END (PASS) - Cookbook sre.discovery.datacenter (exit_code=0) pool all active/active services in codfw: codfw row C switches upgrade - T334049	[production]
15:13	<jclark@cumin1001>	END (PASS) - Cookbook sre.dns.netbox (exit_code=0)	[production]
15:12	<jclark@cumin1001>	START - Cookbook sre.dns.netbox	[production]
15:09	<hnowlan@deploy1002>	helmfile [eqiad] START helmfile.d/services/thumbor: apply	[production]
15:04	<claime>	enabling puppet on parse2014	[production]
15:04	<claime>	enabling puppet on parse2013	[production]
15:02	<akosiaris>	enable puppet on parse1005	[production]
15:00	<jclark@cumin1001>	END (PASS) - Cookbook sre.dns.netbox (exit_code=0)	[production]
15:00	<hnowlan@deploy1002>	helmfile [eqiad] START helmfile.d/services/thumbor: apply	[production]
14:59	<jiji@cumin1001>	START - Cookbook sre.discovery.datacenter pool all active/active services in codfw: codfw row C switches upgrade - T334049	[production]
14:59	<jclark@cumin1001>	START - Cookbook sre.dns.netbox	[production]
14:58	<hnowlan@deploy1002>	helmfile [codfw] DONE helmfile.d/services/thumbor: apply	[production]
14:56	<jclark@cumin1001>	END (PASS) - Cookbook sre.dns.netbox (exit_code=0)	[production]
14:55	<hnowlan@deploy1002>	helmfile [codfw] START helmfile.d/services/thumbor: apply	[production]
14:54	<jclark@cumin1001>	START - Cookbook sre.dns.netbox	[production]
14:53	<hnowlan@deploy1002>	helmfile [staging] DONE helmfile.d/services/thumbor: apply	[production]
14:52	<hnowlan@deploy1002>	helmfile [staging] START helmfile.d/services/thumbor: apply	[production]
14:40	<moritzm>	installing intel-microcode security updates on bullseye servers	[production]
14:40	<akosiaris>	emergency disabling of puppet on parse hosts	[production]
14:33	<akosiaris@deploy1002>	helmfile [staging] DONE helmfile.d/services/machinetranslation: sync	[production]
14:33	<claime>	Merging new internal certs for api, jobrunner, appservers, parsoid - T313227	[production]
14:29	<akosiaris@deploy1002>	helmfile [staging] START helmfile.d/services/machinetranslation: sync	[production]
14:27	<denisse>	sync prometheus3001 -> prometheus3002	[production]
14:27	<akosiaris@deploy1002>	helmfile [staging] DONE helmfile.d/services/machinetranslation: apply	[production]
14:23	<_joe_>	also on contint1002, the current ci master	[production]
14:22	<_joe_>	restarted zuul on contint2001	[production]
14:07	<akosiaris@deploy1002>	helmfile [staging] START helmfile.d/services/machinetranslation: apply	[production]
13:51	<sukhe>	run authdns-update to repool codfw	[production]
13:47	<cgoubert@cumin1001>	conftool action : set/pooled=yes; selector: dc=codfw,cluster=parsoid	[production]
13:47	<cgoubert@cumin1001>	conftool action : set/pooled=yes; selector: dc=codfw,cluster=appserver	[production]