production SAL

2024-02-22
16:16	<volans@cumin1002>	START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox	[production]
16:11	<mvernon@cumin2002>	conftool action : set/pooled=true; selector: dnsdisc=swift,name=codfw	[production]
16:11	<Emperor>	repool codfs-mw T355868	[production]
16:10	<Emperor>	repool thanos-fe2002 T355868	[production]
16:07	<arnaudb@cumin1002>	dbctl commit (dc=all): 'Repooling after maintenance db1214 (T357189)', diff saved to https://phabricator.wikimedia.org/P57740 and previous config saved to /var/cache/conftool/dbconfig/20240222-160753-arnaudb.json	[production]
16:06	<marostegui@cumin1002>	dbctl commit (dc=all): 'db2149 (re)pooling @ 75%: After recloning', diff saved to https://phabricator.wikimedia.org/P57739 and previous config saved to /var/cache/conftool/dbconfig/20240222-160646-root.json	[production]
16:05	<arnaudb@cumin1002>	dbctl commit (dc=all): 'Depooling db1214 (T357189)', diff saved to https://phabricator.wikimedia.org/P57738 and previous config saved to /var/cache/conftool/dbconfig/20240222-160534-arnaudb.json	[production]
16:05	<arnaudb@cumin1002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1214.eqiad.wmnet with reason: Maintenance	[production]
16:05	<volans@cumin1002>	END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts sretest1001.eqiad.wmnet	[production]
16:05	<arnaudb@cumin1002>	START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1214.eqiad.wmnet with reason: Maintenance	[production]
16:05	<arnaudb@cumin1002>	dbctl commit (dc=all): 'Repooling after maintenance db1211 (T357189)', diff saved to https://phabricator.wikimedia.org/P57737 and previous config saved to /var/cache/conftool/dbconfig/20240222-160512-arnaudb.json	[production]
16:04	<volans@cumin1002>	START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts sretest1001.eqiad.wmnet	[production]
16:00	<topranks>	Commencing network maintenance migrating servers to new switch codfw rack B2 T355868	[production]
15:58	<cmooney@cumin1002>	END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host testvm2002.codfw.wmnet with OS bullseye	[production]
15:57	<hnowlan>	depooling mw[1458,1467-1468,1483-1485,1494].eqiad.wmnet in advance of reimaging	[production]
15:56	<cmooney@cumin1002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on 25 hosts with reason: Migrating servers in codfw rack B2 to lsw1-b2-codfw	[production]
15:55	<cmooney@cumin1002>	START - Cookbook sre.hosts.downtime for 0:30:00 on 25 hosts with reason: Migrating servers in codfw rack B2 to lsw1-b2-codfw	[production]
15:54	<mvernon@cumin2002>	conftool action : set/pooled=false; selector: dnsdisc=swift,name=codfw	[production]
15:54	<Emperor>	depool codfs-mw T355868	[production]
15:53	<Emperor>	depool thanos-fe2002 T355868	[production]
15:51	<marostegui@cumin1002>	dbctl commit (dc=all): 'db2149 (re)pooling @ 50%: After recloning', diff saved to https://phabricator.wikimedia.org/P57736 and previous config saved to /var/cache/conftool/dbconfig/20240222-155141-root.json	[production]
15:50	<arnaudb@cumin1002>	dbctl commit (dc=all): 'Repooling after maintenance db1211', diff saved to https://phabricator.wikimedia.org/P57735 and previous config saved to /var/cache/conftool/dbconfig/20240222-155005-arnaudb.json	[production]
15:48	<cmooney@cumin1002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on asw-b-codfw,cr[1-2]-codfw,lsw1-b2-codfw.mgmt with reason: prepping for server uplink migration codfw rack b2	[production]
15:48	<cmooney@cumin1002>	START - Cookbook sre.hosts.downtime for 1:00:00 on asw-b-codfw,cr[1-2]-codfw,lsw1-b2-codfw.mgmt with reason: prepping for server uplink migration codfw rack b2	[production]
15:46	<sukhe@cumin2002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on cp[2031-2032].codfw.wmnet with reason: T355868	[production]
15:46	<sukhe@cumin2002>	START - Cookbook sre.hosts.downtime for 3:00:00 on cp[2031-2032].codfw.wmnet with reason: T355868	[production]
15:39	<aqu@deploy2002>	Finished deploy [airflow-dags/analytics_test@b115452]: Deploy Refine job POC on test cluster (duration: 00m 16s)	[production]
15:39	<aqu@deploy2002>	Started deploy [airflow-dags/analytics_test@b115452]: Deploy Refine job POC on test cluster	[production]
15:36	<marostegui@cumin1002>	dbctl commit (dc=all): 'db2149 (re)pooling @ 25%: After recloning', diff saved to https://phabricator.wikimedia.org/P57734 and previous config saved to /var/cache/conftool/dbconfig/20240222-153636-root.json	[production]
15:35	<arnaudb@cumin1002>	dbctl commit (dc=all): 'Repooling after maintenance db1211', diff saved to https://phabricator.wikimedia.org/P57733 and previous config saved to /var/cache/conftool/dbconfig/20240222-153459-arnaudb.json	[production]
15:32	<cmooney@cumin1002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on testvm2002.codfw.wmnet with reason: host reimage	[production]
15:27	<moritzm>	installing glib2.0 security updates on bullseye	[production]
15:27	<cmooney@cumin1002>	START - Cookbook sre.hosts.downtime for 2:00:00 on testvm2002.codfw.wmnet with reason: host reimage	[production]
15:21	<marostegui@cumin1002>	dbctl commit (dc=all): 'db2149 (re)pooling @ 10%: After recloning', diff saved to https://phabricator.wikimedia.org/P57732 and previous config saved to /var/cache/conftool/dbconfig/20240222-152131-root.json	[production]
15:19	<arnaudb@cumin1002>	dbctl commit (dc=all): 'Repooling after maintenance db1211 (T357189)', diff saved to https://phabricator.wikimedia.org/P57731 and previous config saved to /var/cache/conftool/dbconfig/20240222-151952-arnaudb.json	[production]
15:17	<arnaudb@cumin1002>	dbctl commit (dc=all): 'Depooling db1211 (T357189)', diff saved to https://phabricator.wikimedia.org/P57730 and previous config saved to /var/cache/conftool/dbconfig/20240222-151733-arnaudb.json	[production]
15:17	<arnaudb@cumin1002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1211.eqiad.wmnet with reason: Maintenance	[production]
15:17	<arnaudb@cumin1002>	START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1211.eqiad.wmnet with reason: Maintenance	[production]
15:17	<arnaudb@cumin1002>	dbctl commit (dc=all): 'Repooling after maintenance db1203 (T357189)', diff saved to https://phabricator.wikimedia.org/P57729 and previous config saved to /var/cache/conftool/dbconfig/20240222-151701-arnaudb.json	[production]
15:15	<cmooney@cumin1002>	START - Cookbook sre.hosts.reimage for host testvm2002.codfw.wmnet with OS bullseye	[production]
15:15	<akosiaris@cumin1002>	conftool action : set/pooled=yes; selector: service=parsoid-php,name=kubernetes.*	[production]
15:15	<akosiaris>	T357392 pool 46 kubernetes hosts of parsoid-php with a weight of 1. Since the 42 parse hosts are at weight 110, that means 1% goes to mw-parsoid deployment, aka mw-on-k8s	[production]
15:13	<akosiaris@cumin1002>	conftool action : set/weight=1; selector: service=parsoid-php,name=kubernetes.*	[production]
15:12	<akosiaris@cumin1002>	conftool action : set/weight=110; selector: service=parsoid-php,name=(pars.\|mw.)	[production]
15:12	<akosiaris>	Bump weight of old parsoid hosts from 10 to 110. This is a noop right now but will make the calculations spelled out in T357392 possible later.	[production]
14:55	<akosiaris@deploy2002>	helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply	[production]
14:55	<akosiaris@deploy2002>	helmfile [codfw] START helmfile.d/services/mw-parsoid: apply	[production]
14:55	<akosiaris@deploy2002>	helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply	[production]
14:55	<akosiaris@deploy2002>	helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply	[production]
14:51	<marostegui@cumin1002>	dbctl commit (dc=all): 'db2149 (re)pooling @ 1%: After recloning', diff saved to https://phabricator.wikimedia.org/P57726 and previous config saved to /var/cache/conftool/dbconfig/20240222-145120-root.json	[production]
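
Note on the 15:15 parsoid-php entry: the "1%" figure can be checked with a short calculation. This is a minimal sketch and assumes that traffic is split across pooled hosts in proportion to their weights, which the log does not state explicitly. The function name traffic_share is hypothetical and used only for illustration; the host counts and weights are taken from the 15:12 to 15:15 entries.

```python
from fractions import Fraction


def traffic_share(n_new: int, w_new: int, n_old: int, w_old: int) -> Fraction:
    """Fraction of requests reaching the new hosts.

    Assumption: requests are distributed proportionally to each pooled
    host's weight (simple weighted balancing).
    """
    new_total = n_new * w_new
    old_total = n_old * w_old
    return Fraction(new_total, new_total + old_total)


# 46 kubernetes hosts at weight 1, 42 existing hosts at weight 110
share = traffic_share(46, 1, 42, 110)
print(f"{float(share):.2%}")  # prints 0.99%
```

With the old weight of 10 the same 46 hosts at weight 1 would have received 46 / (46 + 420), roughly 10% of requests, which is presumably why the old hosts were bumped to 110 first.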