production SAL

2022-12-13
22:06	<aqu@deploy1002>	Finished deploy [analytics/refinery@66736e1]: HDFS FSImage conversion to XML script [analytics/refinery@66736e1] (duration: 26m 32s)	[production]
21:53	<cwhite@cumin2002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on logstash2033.codfw.wmnet with reason: host reimage	[production]
21:52	<reedy@deploy1002>	Synchronized wmf-config/CommonSettings.php: extension distributor updates (duration: 06m 50s)	[production]
21:50	<cwhite@cumin2002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on logstash2034.codfw.wmnet with reason: host reimage	[production]
21:48	<cwhite@cumin2002>	START - Cookbook sre.hosts.downtime for 2:00:00 on logstash2033.codfw.wmnet with reason: host reimage	[production]
21:47	<cwhite@cumin2002>	START - Cookbook sre.hosts.downtime for 2:00:00 on logstash2034.codfw.wmnet with reason: host reimage	[production]
21:43	<cwhite@cumin2002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on logstash2035.codfw.wmnet with reason: host reimage	[production]
21:41	<kindrobot>	Finishing UTC late backport window	[production]
21:40	<kindrobot@deploy1002>	Finished scap: Backport for [[gerrit:863228\|Start writing to cul_actor everywhere (T233004)]] (duration: 18m 47s)	[production]
21:40	<aqu@deploy1002>	Started deploy [analytics/refinery@66736e1]: HDFS FSImage conversion to XML script [analytics/refinery@66736e1]	[production]
21:39	<cwhite@cumin2002>	START - Cookbook sre.hosts.downtime for 2:00:00 on logstash2035.codfw.wmnet with reason: host reimage	[production]
21:35	<aqu>	Deploying analytics/refinery (HDFS FSImage conversion to XML script)	[production]
21:32	<cwhite@cumin2002>	START - Cookbook sre.hosts.reimage for host logstash2033.codfw.wmnet with OS bullseye	[production]
21:30	<cwhite@cumin2002>	START - Cookbook sre.hosts.reimage for host logstash2034.codfw.wmnet with OS bullseye	[production]
21:23	<kindrobot@deploy1002>	kindrobot and zabe: Backport for [[gerrit:863228\|Start writing to cul_actor everywhere (T233004)]] synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet	[production]
21:23	<cwhite@cumin2002>	START - Cookbook sre.hosts.reimage for host logstash2035.codfw.wmnet with OS bullseye	[production]
21:21	<kindrobot@deploy1002>	Started scap: Backport for [[gerrit:863228\|Start writing to cul_actor everywhere (T233004)]]	[production]
21:17	<samtar@deploy1002>	Finished scap: Backport for [[gerrit:867233\|Child elements also trigger previews (T325007)]] (duration: 09m 38s)	[production]
21:09	<samtar@deploy1002>	samtar and jdlrobson: Backport for [[gerrit:867233\|Child elements also trigger previews (T325007)]] synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet	[production]
21:08	<samtar@deploy1002>	Started scap: Backport for [[gerrit:867233\|Child elements also trigger previews (T325007)]]	[production]
20:20	<ryankemper@cumin1001>	END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.UPGRADE (1 nodes at a time) for ElasticSearch cluster relforge: relforge elasticsearch and plugin upgrade - ryankemper@cumin1001 - T322776	[production]
20:17	<ryankemper@cumin1001>	START - Cookbook sre.elasticsearch.rolling-operation Operation.UPGRADE (1 nodes at a time) for ElasticSearch cluster relforge: relforge elasticsearch and plugin upgrade - ryankemper@cumin1001 - T322776	[production]
20:17	<ryankemper@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on relforge[1003-1004].eqiad.wmnet with reason: Rolling restart	[production]
20:16	<ryankemper@cumin1001>	START - Cookbook sre.hosts.downtime for 4:00:00 on relforge[1003-1004].eqiad.wmnet with reason: Rolling restart	[production]
20:10	<ryankemper@cumin1001>	END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.UPGRADE (1 nodes at a time) for ElasticSearch cluster relforge: relforge elasticsearch and plugin upgrade - ryankemper@cumin1001 - T322776	[production]
20:10	<ryankemper@cumin1001>	START - Cookbook sre.elasticsearch.rolling-operation Operation.UPGRADE (1 nodes at a time) for ElasticSearch cluster relforge: relforge elasticsearch and plugin upgrade - ryankemper@cumin1001 - T322776	[production]
19:18	<dzahn@cumin2002>	END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts contint1001.wikimedia.org	[production]
19:18	<dzahn@cumin2002>	END (PASS) - Cookbook sre.dns.netbox (exit_code=0)	[production]
19:18	<dzahn@cumin2002>	END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: contint1001.wikimedia.org decommissioned, removing all IPs except the asset tag one - dzahn@cumin2002"	[production]
19:00	<ladsgroup@deploy1002>	Finished scap: Backport for [[gerrit:867609\|ParserCache: fix metrics keys]], [[gerrit:867611\|Don't write to parser cache from maintenance script]], [[gerrit:867613\|Fix brittle test]] (duration: 07m 53s)	[production]
18:59	<dzahn@cumin2002>	START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: contint1001.wikimedia.org decommissioned, removing all IPs except the asset tag one - dzahn@cumin2002"	[production]
18:57	<dzahn@cumin2002>	START - Cookbook sre.dns.netbox	[production]
18:54	<ladsgroup@deploy1002>	ladsgroup and ladsgroup: Backport for [[gerrit:867609\|ParserCache: fix metrics keys]], [[gerrit:867611\|Don't write to parser cache from maintenance script]], [[gerrit:867613\|Fix brittle test]] synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet	[production]
18:52	<ladsgroup@deploy1002>	Started scap: Backport for [[gerrit:867609\|ParserCache: fix metrics keys]], [[gerrit:867611\|Don't write to parser cache from maintenance script]], [[gerrit:867613\|Fix brittle test]]	[production]
18:51	<dzahn@cumin2002>	START - Cookbook sre.hosts.decommission for hosts contint1001.wikimedia.org	[production]
18:51	<mutante>	decom'ing contint1001 (formerly prod CI) server, replaced by contint1002 T324698	[production]
18:47	<ladsgroup@deploy1002>	Finished scap: Backport for [[gerrit:867610\|Don't write to parser cache from maintenance script]], [[gerrit:867608\|ParserCache: fix metrics keys]] (duration: 09m 25s)	[production]
18:40	<ladsgroup@deploy1002>	ladsgroup and ladsgroup: Backport for [[gerrit:867610\|Don't write to parser cache from maintenance script]], [[gerrit:867608\|ParserCache: fix metrics keys]] synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet	[production]
18:38	<ladsgroup@deploy1002>	Started scap: Backport for [[gerrit:867610\|Don't write to parser cache from maintenance script]], [[gerrit:867608\|ParserCache: fix metrics keys]]	[production]
18:21	<btullis@cumin1001>	END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kafka-stretch1001.eqiad.wmnet with OS bullseye	[production]
17:27	<btullis@cumin1001>	START - Cookbook sre.hosts.reimage for host kafka-stretch1001.eqiad.wmnet with OS bullseye	[production]
17:26	<btullis>	edited automation/proxies/ttyS1-115200.conf to remove `include "/etc/dhcp/automation/ttyS1-115200/kafka-stretch1001.conf";` and restarted isc-dhcp-server	[production]
17:22	<btullis>	btullis@install1003:/etc/dhcp/automation/ttyS1-115200$ sudo systemctl restart isc-dhcp-server.service T314156	[production]
16:35	<btullis@cumin1001>	END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kafka-stretch1001.eqiad.wmnet with OS bullseye	[production]
16:35	<btullis@cumin1001>	START - Cookbook sre.hosts.reimage for host kafka-stretch1001.eqiad.wmnet with OS bullseye	[production]
16:17	<hnowlan@puppetmaster1001>	conftool action : set/weight=10:pooled=no; selector: service=thumbor,name=kubernetes1010.eqiad.wmnet	[production]
16:15	<hnowlan@puppetmaster1001>	conftool action : set/weight=10:pooled=yes; selector: service=thumbor,name=kubernetes1010.eqiad.wmnet	[production]
16:12	<btullis@cumin1001>	END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kafka-stretch2002.codfw.wmnet with OS bullseye	[production]
16:05	<hnowlan@puppetmaster1001>	conftool action : set/pooled=no; selector: service=thumbor,name=kubernetes101[01234].eqiad.wmnet	[production]
16:03	<moritzm>	installing ruby-tzinfo security updates	[production]