production SAL

2022-10-31 §
07:02	<ryankemper>	[WDQS] `ryankemper@wdqs1007:~$ sudo systemctl restart wdqs-blazegraph.service`	[production]
07:00	<marostegui@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db2124 (T321123)', diff saved to https://phabricator.wikimedia.org/P37049 and previous config saved to /var/cache/conftool/dbconfig/20221031-070029-marostegui.json	[production]
06:58	<marostegui@cumin1001>	dbctl commit (dc=all): 'Depooling db2124 (T321123)', diff saved to https://phabricator.wikimedia.org/P37048 and previous config saved to /var/cache/conftool/dbconfig/20221031-065817-marostegui.json	[production]
06:58	<marostegui@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2124.codfw.wmnet with reason: Maintenance	[production]
06:58	<marostegui@cumin1001>	START - Cookbook sre.hosts.downtime for 8:00:00 on db2124.codfw.wmnet with reason: Maintenance	[production]
06:57	<marostegui@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db2117 (T321123)', diff saved to https://phabricator.wikimedia.org/P37047 and previous config saved to /var/cache/conftool/dbconfig/20221031-065756-marostegui.json	[production]
06:42	<marostegui@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db2117', diff saved to https://phabricator.wikimedia.org/P37046 and previous config saved to /var/cache/conftool/dbconfig/20221031-064249-marostegui.json	[production]
06:12	<marostegui@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db2117 (T321123)', diff saved to https://phabricator.wikimedia.org/P37044 and previous config saved to /var/cache/conftool/dbconfig/20221031-061236-marostegui.json	[production]
06:10	<marostegui@cumin1001>	dbctl commit (dc=all): 'Depooling db2117 (T321123)', diff saved to https://phabricator.wikimedia.org/P37043 and previous config saved to /var/cache/conftool/dbconfig/20221031-061026-marostegui.json	[production]
06:10	<marostegui@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2117.codfw.wmnet with reason: Maintenance	[production]
06:09	<marostegui@cumin1001>	START - Cookbook sre.hosts.downtime for 8:00:00 on db2117.codfw.wmnet with reason: Maintenance	[production]
06:08	<marostegui@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1122.eqiad.wmnet with reason: Maintenance	[production]
06:08	<marostegui@cumin1001>	START - Cookbook sre.hosts.downtime for 8:00:00 on db1122.eqiad.wmnet with reason: Maintenance	[production]
05:42	<kartik@deploy1002>	helmfile [staging] DONE helmfile.d/services/cxserver: apply	[production]
05:41	<kartik@deploy1002>	helmfile [staging] START helmfile.d/services/cxserver: apply	[production]
2022-10-29 §
11:25	<taavi>	deploy patch for T321971	[production]
11:22	<mwdebug-deploy@deploy1002>	helmfile [codfw] DONE helmfile.d/services/mw-debug: apply	[production]
11:21	<mwdebug-deploy@deploy1002>	helmfile [codfw] START helmfile.d/services/mw-debug: apply	[production]
11:21	<mwdebug-deploy@deploy1002>	helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply	[production]
11:21	<mwdebug-deploy@deploy1002>	helmfile [eqiad] START helmfile.d/services/mw-debug: apply	[production]
2022-10-28 §
20:42	<mutante>	clouddumps* - deployed gerrit:848444 - as kind of expected it fails - most likely the project dirs are not automatically created before rsync runs the first time - T57503	[production]
20:37	<mutante>	clouddumps1001 - puppet run after merging gerrit:848441 for kiwix, changed ferm status from "stopped" to "running". manually ran 'sudo systemctl start kiwix-mirror-update' T57503	[production]
19:17	<mutante>	contint* - changing source for scap repo to gitlab - gerrit:850246 T321847	[production]
18:54	<ebernhardson@deploy1002>	Finished deploy [wikimedia/discovery/analytics@2326f9c]: Import cirrus indexes to hdfs (duration: 02m 07s)	[production]
18:52	<ebernhardson@deploy1002>	Started deploy [wikimedia/discovery/analytics@2326f9c]: Import cirrus indexes to hdfs	[production]
18:44	<bking@cumin2002>	END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)	[production]
18:11	<bking@cumin2002>	START - Cookbook sre.wdqs.data-transfer	[production]
18:08	<bking@cumin2002>	END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99)	[production]
18:08	<bking@cumin2002>	START - Cookbook sre.wdqs.data-transfer	[production]
17:31	<sukhe@puppetmaster1001>	conftool action : set/pooled=yes; selector: name=cp4052.ulsfo.wmnet	[production]
17:31	<sukhe@puppetmaster1001>	conftool action : set/weight=1; selector: name=cp4052.ulsfo.wmnet,service=varnish-fe	[production]
17:31	<sukhe@puppetmaster1001>	conftool action : set/weight=1; selector: name=cp4052.ulsfo.wmnet,service=ats-tls	[production]
17:31	<sukhe@puppetmaster1001>	conftool action : set/weight=100; selector: name=cp4052.ulsfo.wmnet,service=ats-be	[production]
17:28	<robh@cumin2002>	END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4052.ulsfo.wmnet with OS buster	[production]
17:09	<cparle@deploy1002>	Finished deploy [airflow-dags/platform_eng@c849762]: (no justification provided) (duration: 00m 05s)	[production]
17:09	<cparle@deploy1002>	Started deploy [airflow-dags/platform_eng@c849762]: (no justification provided)	[production]
17:07	<xcollazo@deploy1002>	Finished deploy [airflow-dags/platform_eng@c849762]: (no justification provided) (duration: 00m 11s)	[production]
17:07	<xcollazo@deploy1002>	Started deploy [airflow-dags/platform_eng@c849762]: (no justification provided)	[production]
17:02	<robh@cumin2002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4052.ulsfo.wmnet with reason: host reimage	[production]
16:57	<robh@cumin2002>	START - Cookbook sre.hosts.downtime for 2:00:00 on cp4052.ulsfo.wmnet with reason: host reimage	[production]
16:38	<mforns@deploy1002>	Finished deploy [airflow-dags/analytics@62b4181]: testing scap since we are having problems with other instances (duration: 00m 04s)	[production]
16:38	<mforns@deploy1002>	Started deploy [airflow-dags/analytics@62b4181]: testing scap since we are having problems with other instances	[production]
16:31	<robh@cumin2002>	START - Cookbook sre.hosts.reimage for host cp4052.ulsfo.wmnet with OS buster	[production]
16:31	<marostegui@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance	[production]
16:31	<marostegui@cumin1001>	START - Cookbook sre.hosts.downtime for 8:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance	[production]
16:31	<marostegui@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1197 (T321123)', diff saved to https://phabricator.wikimedia.org/P37038 and previous config saved to /var/cache/conftool/dbconfig/20221028-163102-marostegui.json	[production]
16:29	<robh@cumin2002>	END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp4052.ulsfo.wmnet with OS buster	[production]
16:29	<robh@cumin2002>	START - Cookbook sre.hosts.reimage for host cp4052.ulsfo.wmnet with OS buster	[production]
16:27	<robh@cumin2002>	END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp4052.ulsfo.wmnet with OS buster	[production]
16:27	<robh@cumin2002>	START - Cookbook sre.hosts.reimage for host cp4052.ulsfo.wmnet with OS buster	[production]