production SAL

2021-09-07
17:09	<jgiannelos@deploy1002>	helmfile [eqiad] Ran 'sync' command on namespace 'push-notifications' for release 'main' .	[production]
17:01	<jgiannelos@deploy1002>	helmfile [staging] Ran 'sync' command on namespace 'push-notifications' for release 'main' .	[production]
16:39	<moritzm>	installing jetty9 security updates on buster	[production]
16:30	<dzahn@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5 days, 8:00:00 on planet1002.eqiad.wmnet with reason: known issue	[production]
16:30	<dzahn@cumin1001>	START - Cookbook sre.hosts.downtime for 5 days, 8:00:00 on planet1002.eqiad.wmnet with reason: known issue	[production]
16:30	<dancy@deploy1002>	Synchronized README: testing (duration: 00m 59s)	[production]
15:18	<akosiaris>	run_benchmarky.py against mwdebug.svc.codfw.wmnet for performance tests	[production]
15:07	<akosiaris@deploy1002>	helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .	[production]
15:04	<jbond>	upload python-prometheus-client_0.6.0 to stretch-wikimedia	[production]
14:50	<mutante>	snapshot1015 - manually removed prometheus-puppet-agent-stats from crontab which was sending spam and is now a timer	[production]
14:33	<mutante>	CI - migrating zuul-merger cronjob to systemd timer (contint*)	[production]
14:23	<XioNoX>	re-pool esams-eqiad - T288503	[production]
14:23	<cmjohnson@cumin1001>	END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on cloudcephosd1024.eqiad.wmnet with reason: REIMAGE	[production]
14:23	<cmjohnson@cumin1001>	START - Cookbook sre.hosts.downtime for 2:00:00 on cloudcephosd1024.eqiad.wmnet with reason: REIMAGE	[production]
14:22	<cmjohnson@cumin1001>	END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on cloudcephosd1023.eqiad.wmnet with reason: REIMAGE	[production]
14:22	<cmjohnson@cumin1001>	START - Cookbook sre.hosts.downtime for 2:00:00 on cloudcephosd1023.eqiad.wmnet with reason: REIMAGE	[production]
14:17	<marostegui>	No more db maintenance on eqiad T288594	[production]
14:08	<mutante>	alert1001 - temp disabled puppet, stopped icinga-wm	[production]
14:07	<mutante>	temp killed icinga-wm because of flooding	[production]
14:01	<Emperor>	removing pc2010 from orchestrator T289117	[production]
13:59	<Emperor>	removing pc2010 from tendril and zarcillo T289117	[production]
13:57	<pt1979@cumin2002>	END (PASS) - Cookbook sre.dns.netbox (exit_code=0)	[production]
13:57	<XioNoX>	drain esams-eqiad for circuit maintenance - T288503	[production]
13:54	<pt1979@cumin2002>	START - Cookbook sre.dns.netbox	[production]
13:51	<jayme>	uncordoned kubestage2001	[production]
13:50	<jiji@deploy1002>	helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .	[production]
13:49	<mutante>	mw2264 - scap pulled and repooled after T290242	[production]
13:49	<dzahn@cumin1001>	conftool action : set/pooled=yes; selector: name=mw2264.codfw.wmnet	[production]
13:43	<jiji@deploy1002>	helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .	[production]
13:40	<mvernon@cumin1001>	END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts pc2010.codfw.wmnet	[production]
13:25	<mvernon@cumin1001>	START - Cookbook sre.hosts.decommission for hosts pc2010.codfw.wmnet	[production]
13:21	<Emperor>	removing pc2009 from orchestrator T289116	[production]
13:21	<Emperor>	removing pc2009 from tendril and zarcillo T289116	[production]
13:02	<marostegui@cumin1001>	dbctl commit (dc=all): 'fix s8 weights T288594', diff saved to https://phabricator.wikimedia.org/P17248 and previous config saved to /var/cache/conftool/dbconfig/20210907-130244-marostegui.json	[production]
12:59	<mvernon@cumin1001>	END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts pc2009.codfw.wmnet	[production]
12:51	<mvernon@deploy1002>	Synchronized wmf-config/ProductionServices.php: Remove old decommissioned pc hosts T284825 (duration: 01m 02s)	[production]
12:45	<mvernon@cumin1001>	START - Cookbook sre.hosts.decommission for hosts pc2009.codfw.wmnet	[production]
12:27	<marostegui@cumin1001>	dbctl commit (dc=all): 'fix s1 weights T288594', diff saved to https://phabricator.wikimedia.org/P17247 and previous config saved to /var/cache/conftool/dbconfig/20210907-122747-marostegui.json	[production]
12:27	<marostegui@cumin1001>	dbctl commit (dc=all): 'fix s1 weights T288594', diff saved to https://phabricator.wikimedia.org/P17246 and previous config saved to /var/cache/conftool/dbconfig/20210907-122708-marostegui.json	[production]
11:46	<btullis@cumin1001>	END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for 6 hosts	[production]
11:46	<btullis@cumin1001>	START - Cookbook sre.hosts.remove-downtime for 6 hosts	[production]
11:36	<awight>	EU backport complete	[production]
11:33	<awight@deploy1002>	Synchronized php-1.37.0-wmf.21/extensions/CodeMirror/extension.json: Backport: [[gerrit:719170|Change line numbers default to null (T290226)]] (duration: 00m 59s)	[production]
11:28	<awight@deploy1002>	Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:717192|Set template namespace for code mirror line numbering (T290226)]] (duration: 00m 59s)	[production]
10:51	<Emperor>	removing pc2008 from orchestrator T289115	[production]
10:49	<Emperor>	removing pc2008 from tendril and zarcillo T289115	[production]
10:46	<mvernon@cumin1001>	END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts pc2008.codfw.wmnet	[production]
10:35	<mvernon@cumin1001>	START - Cookbook sre.hosts.decommission for hosts pc2008.codfw.wmnet	[production]
10:29	<btullis@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5 days, 0:00:00 on 6 hosts with reason: commissioning aqs_new hosts	[production]
10:29	<btullis@cumin1001>	START - Cookbook sre.hosts.downtime for 5 days, 0:00:00 on 6 hosts with reason: commissioning aqs_new hosts	[production]