production SAL

2021-01-11 §
09:31	<marostegui>	Sanitize db1155:3314 - T268742	[production]
09:31	<marostegui>	Deploy schema change on s1 codfw master - T270187	[production]
09:02	<elukey>	force puppet on logstash1007 after ES OOM	[production]
08:54	<godog>	swift codfw-prod: more weight to ms-be20[58-61] - T269337	[production]
08:24	<jiji@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1030.eqiad.wmnet with reason: REIMAGE	[production]
08:22	<jiji@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc2030.codfw.wmnet with reason: REIMAGE	[production]
08:20	<jiji@cumin1001>	START - Cookbook sre.hosts.downtime for 2:00:00 on mc1030.eqiad.wmnet with reason: REIMAGE	[production]
08:19	<jiji@cumin1001>	START - Cookbook sre.hosts.downtime for 2:00:00 on mc2030.codfw.wmnet with reason: REIMAGE	[production]
07:49	<dcausse>	depooling & restarting blazegraph on wdqs2007 (T242453)	[production]
07:48	<marostegui@cumin1001>	dbctl commit (dc=all): 'db1136 (re)pooling @ 100%: After schema change', diff saved to https://phabricator.wikimedia.org/P13709 and previous config saved to /var/cache/conftool/dbconfig/20210111-074853-root.json	[production]
07:43	<dcausse>	repool wdqs1007 (wrong machine) (T242453)	[production]
07:41	<dcausse>	depooling & restarting blazegraph on wdqs1007 (T242453)	[production]
07:33	<marostegui@cumin1001>	dbctl commit (dc=all): 'db1136 (re)pooling @ 75%: After schema change', diff saved to https://phabricator.wikimedia.org/P13708 and previous config saved to /var/cache/conftool/dbconfig/20210111-073349-root.json	[production]
07:18	<marostegui@cumin1001>	dbctl commit (dc=all): 'db1136 (re)pooling @ 50%: After schema change', diff saved to https://phabricator.wikimedia.org/P13707 and previous config saved to /var/cache/conftool/dbconfig/20210111-071846-root.json	[production]
07:12	<marostegui>	Deploy schema change on s8 codfw master - T270187	[production]
07:03	<marostegui@cumin1001>	dbctl commit (dc=all): 'db1136 (re)pooling @ 25%: After schema change', diff saved to https://phabricator.wikimedia.org/P13706 and previous config saved to /var/cache/conftool/dbconfig/20210111-070342-root.json	[production]
06:56	<marostegui@cumin1001>	dbctl commit (dc=all): 'Depool db1136', diff saved to https://phabricator.wikimedia.org/P13704 and previous config saved to /var/cache/conftool/dbconfig/20210111-065640-marostegui.json	[production]
06:55	<marostegui@cumin1001>	dbctl commit (dc=all): 'db1079 (re)pooling @ 100%: After schema change', diff saved to https://phabricator.wikimedia.org/P13703 and previous config saved to /var/cache/conftool/dbconfig/20210111-065550-root.json	[production]
06:40	<marostegui@cumin1001>	dbctl commit (dc=all): 'db1079 (re)pooling @ 50%: After schema change', diff saved to https://phabricator.wikimedia.org/P13702 and previous config saved to /var/cache/conftool/dbconfig/20210111-064046-root.json	[production]
06:32	<marostegui@cumin1001>	dbctl commit (dc=all): 'Depool db1079', diff saved to https://phabricator.wikimedia.org/P13701 and previous config saved to /var/cache/conftool/dbconfig/20210111-063226-marostegui.json	[production]
06:31	<marostegui@cumin1001>	dbctl commit (dc=all): 'Repool db1094', diff saved to https://phabricator.wikimedia.org/P13700 and previous config saved to /var/cache/conftool/dbconfig/20210111-063155-marostegui.json	[production]
06:31	<marostegui@cumin1001>	dbctl commit (dc=all): 'Depool db1094', diff saved to https://phabricator.wikimedia.org/P13699 and previous config saved to /var/cache/conftool/dbconfig/20210111-063124-marostegui.json	[production]
06:04	<marostegui>	Depool db1121 to clone db1155:3314	[production]
06:04	<marostegui>	Deploy schema change on s7 codfw master - T270187	[production]
06:03	<marostegui@cumin1001>	dbctl commit (dc=all): 'Depool db1121', diff saved to https://phabricator.wikimedia.org/P13698 and previous config saved to /var/cache/conftool/dbconfig/20210111-060342-marostegui.json	[production]
2021-01-09 §
00:11	<mutante>	puppetmaster2003 - restarted apache after spewing 500s	[production]
2021-01-08 §
19:48	<andrew@deploy1001>	Finished deploy [striker/deploy@e4db843]: Striker deploy for T269004 (duration: 02m 11s)	[production]
19:45	<andrew@deploy1001>	Started deploy [striker/deploy@e4db843]: Striker deploy for T269004	[production]
19:28	<andrew@deploy1001>	Finished deploy [horizon/deploy@7466703]: Horizon with a bunch of Buster patches (duration: 02m 35s)	[production]
19:26	<andrew@deploy1001>	Started deploy [horizon/deploy@7466703]: Horizon with a bunch of Buster patches	[production]
18:02	<joal@deploy1001>	Finished deploy [analytics/refinery@db9da3c] (thin): Hotfix analytics deployment - THIN [analytics/refinery@db9da3c] (duration: 00m 07s)	[production]
18:02	<joal@deploy1001>	Started deploy [analytics/refinery@db9da3c] (thin): Hotfix analytics deployment - THIN [analytics/refinery@db9da3c]	[production]
18:01	<joal@deploy1001>	Finished deploy [analytics/refinery@db9da3c]: Hotfix analytics deployment [analytics/refinery@db9da3c] (duration: 11m 27s)	[production]
17:50	<joal@deploy1001>	Started deploy [analytics/refinery@db9da3c]: Hotfix analytics deployment [analytics/refinery@db9da3c]	[production]
17:33	<hnowlan@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 12:00:00 on maps2007.codfw.wmnet with reason: Downtiming while not pooled	[production]
17:33	<hnowlan@cumin1001>	START - Cookbook sre.hosts.downtime for 2 days, 12:00:00 on maps2007.codfw.wmnet with reason: Downtiming while not pooled	[production]
17:15	<hnowlan@cumin1001>	END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2 days, 12:00:00 on maps2007.codfw.wmnet with reason: Downtiming while not pooled	[production]
17:15	<hnowlan@cumin1001>	START - Cookbook sre.hosts.downtime for 2 days, 12:00:00 on maps2007.codfw.wmnet with reason: Downtiming while not pooled	[production]
17:15	<hnowlan@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 12:00:00 on maps1009.eqiad.wmnet with reason: Downtiming while not pooled	[production]
17:15	<hnowlan@cumin1001>	START - Cookbook sre.hosts.downtime for 2 days, 12:00:00 on maps1009.eqiad.wmnet with reason: Downtiming while not pooled	[production]
17:10	<andrew@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on labweb1001.wikimedia.org with reason: REIMAGE	[production]
17:08	<andrew@cumin1001>	START - Cookbook sre.hosts.downtime for 2:00:00 on labweb1001.wikimedia.org with reason: REIMAGE	[production]
16:50	<razzi@cumin1001>	END (PASS) - Cookbook sre.dns.netbox (exit_code=0)	[production]
16:43	<razzi@cumin1001>	START - Cookbook sre.dns.netbox	[production]
16:42	<andrewbogott>	shutting down labweb1001 so I can really believe that all traffic is being served by 1002	[production]
16:35	<andrew@deploy1001>	Finished deploy [horizon/deploy@7466703]: selective disable of problematic compression block (duration: 01m 42s)	[production]
16:33	<andrew@deploy1001>	Started deploy [horizon/deploy@7466703]: selective disable of problematic compression block	[production]
16:32	<andrew@deploy1001>	Finished deploy [horizon/deploy@7466703]: selective disable of problematic compression block (duration: 01m 52s)	[production]
16:30	<razzi@cumin1001>	END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1)	[production]
16:30	<andrew@deploy1001>	Started deploy [horizon/deploy@7466703]: selective disable of problematic compression block	[production]