production SAL

2022-05-26
01:01	<mutante>	gitlab1003 - T308089 T274463 - gitlab1003 - systemctl status backup-restore is failed because it's looking for /mnt/gitlab-backup/latest/latest.tar needs gerrit:799016	[production]
00:58	<mutante>	gitlab1001 - T308089 T274463 - gitlab1001 - systemctl start full-backup	[production]
00:56	<mutante>	gitlab1001 - T308089 T274463 - '<+icinga-wm> PROBLEM - Disk space on gitlab1001 is CRITICAL: DISK CRITICAL - free space: /mnt/gitlab-backup 0 MB' - manually deleted 1653294190_2022_05_23_14.10.2_gitlab_backup.tar (we have May 24 and 25, 26 could not finish writing backup) - RECOVERY - Disk space on gitlab1001 is OK	[production]
2022-05-25
23:35	<bd808@deploy1002>	helmfile [eqiad] DONE helmfile.d/services/developer-portal: apply	[production]
23:35	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Depooling db1184 (T298555)', diff saved to https://phabricator.wikimedia.org/P28563 and previous config saved to /var/cache/conftool/dbconfig/20220525-233520-ladsgroup.json	[production]
23:35	<ladsgroup@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1184.eqiad.wmnet with reason: Maintenance	[production]
23:35	<ladsgroup@cumin1001>	START - Cookbook sre.hosts.downtime for 10:00:00 on db1184.eqiad.wmnet with reason: Maintenance	[production]
23:35	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1106 (T298555)', diff saved to https://phabricator.wikimedia.org/P28562 and previous config saved to /var/cache/conftool/dbconfig/20220525-233512-ladsgroup.json	[production]
23:35	<bd808@deploy1002>	helmfile [eqiad] START helmfile.d/services/developer-portal: apply	[production]
23:20	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1106', diff saved to https://phabricator.wikimedia.org/P28561 and previous config saved to /var/cache/conftool/dbconfig/20220525-232007-ladsgroup.json	[production]
23:05	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1106', diff saved to https://phabricator.wikimedia.org/P28560 and previous config saved to /var/cache/conftool/dbconfig/20220525-230502-ladsgroup.json	[production]
22:49	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1106 (T298555)', diff saved to https://phabricator.wikimedia.org/P28559 and previous config saved to /var/cache/conftool/dbconfig/20220525-224957-ladsgroup.json	[production]
22:47	<bd808@deploy1002>	helmfile [codfw] DONE helmfile.d/services/developer-portal: apply	[production]
22:47	<bd808@deploy1002>	helmfile [codfw] START helmfile.d/services/developer-portal: apply	[production]
22:46	<bd808@deploy1002>	helmfile [codfw] DONE helmfile.d/services/developer-portal: apply	[production]
22:45	<bd808@deploy1002>	helmfile [codfw] START helmfile.d/services/developer-portal: apply	[production]
22:06	<cmooney@cumin1001>	END (PASS) - Cookbook sre.dns.netbox (exit_code=0)	[production]
22:03	<cmooney@cumin1001>	START - Cookbook sre.dns.netbox	[production]
21:47	<cmooney@cumin1001>	END (PASS) - Cookbook sre.dns.netbox (exit_code=0)	[production]
21:45	<cmooney@cumin1001>	START - Cookbook sre.dns.netbox	[production]
21:45	<cmooney@cumin1001>	END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)	[production]
21:21	<ejegg>	updated Fundraising CiviCRM from b8b8c177 to dc72ad44	[production]
21:06	<joal@deploy1002>	Finished deploy [airflow-dags/analytics_test@3ae51e7]: (no justification provided) (duration: 00m 06s)	[production]
21:06	<joal@deploy1002>	Started deploy [airflow-dags/analytics_test@3ae51e7]: (no justification provided)	[production]
20:37	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Depooling db1106 (T298555)', diff saved to https://phabricator.wikimedia.org/P28558 and previous config saved to /var/cache/conftool/dbconfig/20220525-203708-ladsgroup.json	[production]
20:37	<ladsgroup@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 20:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance	[production]
20:37	<ladsgroup@cumin1001>	START - Cookbook sre.hosts.downtime for 20:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance	[production]
20:37	<ladsgroup@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1106.eqiad.wmnet with reason: Maintenance	[production]
20:36	<ladsgroup@cumin1001>	START - Cookbook sre.hosts.downtime for 10:00:00 on db1106.eqiad.wmnet with reason: Maintenance	[production]
20:35	<mwdebug-deploy@deploy1002>	helmfile [codfw] DONE helmfile.d/services/mwdebug: apply	[production]
20:34	<mwdebug-deploy@deploy1002>	helmfile [codfw] START helmfile.d/services/mwdebug: apply	[production]
20:34	<mwdebug-deploy@deploy1002>	helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply	[production]
20:33	<mwdebug-deploy@deploy1002>	helmfile [eqiad] START helmfile.d/services/mwdebug: apply	[production]
20:32	<cjming>	end of UTC late backport window	[production]
20:28	<cjming@deploy1002>	Synchronized wmf-config/CirrusSearch-common.php: Config: [[gerrit:775965|cirrus: Migrate popularity_score configuration]] (duration: 00m 51s)	[production]
20:23	<mwdebug-deploy@deploy1002>	helmfile [codfw] DONE helmfile.d/services/mwdebug: apply	[production]
20:22	<mwdebug-deploy@deploy1002>	helmfile [codfw] START helmfile.d/services/mwdebug: apply	[production]
20:22	<mwdebug-deploy@deploy1002>	helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply	[production]
20:21	<mwdebug-deploy@deploy1002>	helmfile [eqiad] START helmfile.d/services/mwdebug: apply	[production]
20:16	<mwdebug-deploy@deploy1002>	helmfile [codfw] DONE helmfile.d/services/mwdebug: apply	[production]
20:15	<bd808@deploy1002>	helmfile [codfw] START helmfile.d/services/developer-portal: apply	[production]
20:15	<mwdebug-deploy@deploy1002>	helmfile [codfw] START helmfile.d/services/mwdebug: apply	[production]
20:15	<mwdebug-deploy@deploy1002>	helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply	[production]
20:14	<cjming@deploy1002>	Synchronized logos/config.yaml: Config: [[gerrit:793027|zhwikivoyage: Declare commons files for logo and its variant (T308620)]] (duration: 00m 49s)	[production]
20:14	<mwdebug-deploy@deploy1002>	helmfile [eqiad] START helmfile.d/services/mwdebug: apply	[production]
20:13	<cjming@deploy1002>	Synchronized wmf-config/logos.php: Config: [[gerrit:793027|zhwikivoyage: Declare commons files for logo and its variant (T308620)]] (duration: 01m 25s)	[production]
20:09	<cjming@deploy1002>	Synchronized static/images/project-logos: Config: [[gerrit:793125|zhwikivoyage: Generate zh-hant logo variant (T308620)]] (duration: 00m 50s)	[production]
20:08	<ladsgroup@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1145.eqiad.wmnet with reason: Maintenance	[production]
20:08	<ladsgroup@cumin1001>	START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1145.eqiad.wmnet with reason: Maintenance	[production]
20:04	<ladsgroup@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1173.eqiad.wmnet with reason: Maintenance	[production]