production SAL

2021-08-10
08:20	<elukey@deploy1002>	helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.	[production]
08:20	<elukey@deploy1002>	helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.	[production]
08:19	<jayme@deploy1002>	helmfile [codfw] DONE helmfile.d/admin 'apply'.	[production]
08:18	<jayme@deploy1002>	helmfile [codfw] START helmfile.d/admin 'apply'.	[production]
08:16	<jayme@deploy1002>	helmfile [eqiad] DONE helmfile.d/admin 'apply'.	[production]
08:16	<jayme@deploy1002>	helmfile [eqiad] START helmfile.d/admin 'apply'.	[production]
08:15	<jayme@deploy1002>	helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.	[production]
08:15	<jayme@deploy1002>	helmfile [staging-eqiad] START helmfile.d/admin 'apply'.	[production]
08:15	<jayme@deploy1002>	helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.	[production]
08:14	<jayme@deploy1002>	helmfile [staging-codfw] START helmfile.d/admin 'apply'.	[production]
08:06	<godog>	upload thanos 0.21.1-1 and upgrade prometheus1004 / thanos-fe2001 to it - T288326	[production]
08:03	<moritzm>	installing openjdk-8 security updates on stretch	[production]
07:33	<moritzm>	installing lynx security updates	[production]
05:56	<marostegui@cumin1001>	dbctl commit (dc=all): 'db2104 (re)pooling @ 100%: repool after failed switchover', diff saved to https://phabricator.wikimedia.org/P16987 and previous config saved to /var/cache/conftool/dbconfig/20210810-055642-root.json	[production]
05:41	<marostegui@cumin1001>	dbctl commit (dc=all): 'db2104 (re)pooling @ 75%: repool after failed switchover', diff saved to https://phabricator.wikimedia.org/P16986 and previous config saved to /var/cache/conftool/dbconfig/20210810-054139-root.json	[production]
05:26	<marostegui@cumin1001>	dbctl commit (dc=all): 'db2104 (re)pooling @ 50%: repool after failed switchover', diff saved to https://phabricator.wikimedia.org/P16985 and previous config saved to /var/cache/conftool/dbconfig/20210810-052635-root.json	[production]
05:11	<marostegui@cumin1001>	dbctl commit (dc=all): 'db2104 (re)pooling @ 25%: repool after failed switchover', diff saved to https://phabricator.wikimedia.org/P16984 and previous config saved to /var/cache/conftool/dbconfig/20210810-051131-root.json	[production]
05:06	<marostegui@cumin1001>	dbctl commit (dc=all): 'Set s2 as read-write again - master has not been swapped T287454', diff saved to https://phabricator.wikimedia.org/P16983 and previous config saved to /var/cache/conftool/dbconfig/20210810-050604-root.json	[production]
05:00	<marostegui@cumin1001>	dbctl commit (dc=all): 'Set s2 codfw as read-only for maintenance - T287454', diff saved to https://phabricator.wikimedia.org/P16982 and previous config saved to /var/cache/conftool/dbconfig/20210810-050051-root.json	[production]
05:00	<marostegui>	Starting s2 codfw failover from db2107 to db2104 - T287454	[production]
04:23	<marostegui@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 24 hosts with reason: Master switchover s2 T287454	[production]
04:23	<marostegui@cumin1001>	START - Cookbook sre.hosts.downtime for 1:00:00 on 24 hosts with reason: Master switchover s2 T287454	[production]
04:16	<marostegui@cumin1001>	dbctl commit (dc=all): 'Set db2104 with weight 0 T287454', diff saved to https://phabricator.wikimedia.org/P16981 and previous config saved to /var/cache/conftool/dbconfig/20210810-041627-root.json	[production]
02:35	<mwdebug-deploy@deploy1002>	helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .	[production]
02:33	<mwdebug-deploy@deploy1002>	helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .	[production]
02:07	<mwdebug-deploy@deploy1002>	helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .	[production]
02:06	<mwdebug-deploy@deploy1002>	helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .	[production]
2021-08-09
16:12	<legoktm@deploy1002>	helmfile [codfw] Ran 'sync' command on namespace 'shellbox-constraints' for release 'main' .	[production]
16:10	<jayme@deploy1002>	helmfile [codfw] DONE helmfile.d/admin 'apply'.	[production]
16:09	<jayme@deploy1002>	helmfile [codfw] START helmfile.d/admin 'apply'.	[production]
16:07	<legoktm@deploy1002>	helmfile [eqiad] Ran 'sync' command on namespace 'shellbox-constraints' for release 'main' .	[production]
16:07	<jayme@deploy1002>	helmfile [eqiad] DONE helmfile.d/admin 'apply'.	[production]
16:07	<jayme@deploy1002>	helmfile [eqiad] START helmfile.d/admin 'apply'.	[production]
16:04	<legoktm@deploy1002>	helmfile [staging] Ran 'sync' command on namespace 'shellbox-constraints' for release 'main' .	[production]
16:03	<jayme@deploy1002>	helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.	[production]
16:03	<jayme@deploy1002>	helmfile [staging-eqiad] START helmfile.d/admin 'apply'.	[production]
16:03	<jayme@deploy1002>	helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.	[production]
16:02	<legoktm@deploy1002>	helmfile [codfw] Ran 'sync' command on namespace 'shellbox' for release 'main' .	[production]
16:02	<jayme@deploy1002>	helmfile [staging-codfw] START helmfile.d/admin 'apply'.	[production]
16:00	<legoktm@deploy1002>	helmfile [eqiad] Ran 'sync' command on namespace 'shellbox' for release 'main' .	[production]
16:00	<jayme@deploy1002>	helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.	[production]
16:00	<jayme@deploy1002>	helmfile [staging-codfw] START helmfile.d/admin 'apply'.	[production]
15:57	<legoktm@deploy1002>	helmfile [staging] Ran 'sync' command on namespace 'shellbox' for release 'main' .	[production]
15:34	<filippo@cumin1001>	END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host ms-be2065.codfw.wmnet	[production]
15:33	<filippo@cumin1001>	END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host ms-be2064.codfw.wmnet	[production]
15:33	<filippo@cumin1001>	END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host ms-be2062.codfw.wmnet	[production]
14:17	<sukhe>	ran homer for Gerrit 710358: Set up BGP peering to doh5002 in eqsin	[production]
14:10	<filippo@cumin1001>	END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2063.codfw.wmnet	[production]
14:09	<hnowlan@puppetmaster1001>	conftool action : set/pooled=no; selector: name=maps100[1234].eqiad.wmnet	[production]
14:06	<jayme>	re-enabled (and ran) puppet on all kubernetes nodes - T288345	[production]