2021-08-10
08:20 <elukey@deploy1002> helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'. [production]
08:20 <elukey@deploy1002> helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'. [production]
08:19 <jayme@deploy1002> helmfile [codfw] DONE helmfile.d/admin 'apply'. [production]
08:18 <jayme@deploy1002> helmfile [codfw] START helmfile.d/admin 'apply'. [production]
08:16 <jayme@deploy1002> helmfile [eqiad] DONE helmfile.d/admin 'apply'. [production]
08:16 <jayme@deploy1002> helmfile [eqiad] START helmfile.d/admin 'apply'. [production]
08:15 <jayme@deploy1002> helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'. [production]
08:15 <jayme@deploy1002> helmfile [staging-eqiad] START helmfile.d/admin 'apply'. [production]
08:15 <jayme@deploy1002> helmfile [staging-codfw] DONE helmfile.d/admin 'apply'. [production]
08:14 <jayme@deploy1002> helmfile [staging-codfw] START helmfile.d/admin 'apply'. [production]
08:06 <godog> upload thanos 0.21.1-1 and upgrade prometheus1004 / thanos-fe2001 to it - T288326 [production]
08:03 <moritzm> installing openjdk-8 security updates on stretch [production]
07:33 <moritzm> installing lynx security updates [production]
05:56 <marostegui@cumin1001> dbctl commit (dc=all): 'db2104 (re)pooling @ 100%: repool after failed switchover', diff saved to https://phabricator.wikimedia.org/P16987 and previous config saved to /var/cache/conftool/dbconfig/20210810-055642-root.json [production]
05:41 <marostegui@cumin1001> dbctl commit (dc=all): 'db2104 (re)pooling @ 75%: repool after failed switchover', diff saved to https://phabricator.wikimedia.org/P16986 and previous config saved to /var/cache/conftool/dbconfig/20210810-054139-root.json [production]
05:26 <marostegui@cumin1001> dbctl commit (dc=all): 'db2104 (re)pooling @ 50%: repool after failed switchover', diff saved to https://phabricator.wikimedia.org/P16985 and previous config saved to /var/cache/conftool/dbconfig/20210810-052635-root.json [production]
05:11 <marostegui@cumin1001> dbctl commit (dc=all): 'db2104 (re)pooling @ 25%: repool after failed switchover', diff saved to https://phabricator.wikimedia.org/P16984 and previous config saved to /var/cache/conftool/dbconfig/20210810-051131-root.json [production]
05:06 <marostegui@cumin1001> dbctl commit (dc=all): 'Set s2 as read-write again - master has not been swapped T287454', diff saved to https://phabricator.wikimedia.org/P16983 and previous config saved to /var/cache/conftool/dbconfig/20210810-050604-root.json [production]
05:00 <marostegui@cumin1001> dbctl commit (dc=all): 'Set s2 codfw as read-only for maintenance - T287454', diff saved to https://phabricator.wikimedia.org/P16982 and previous config saved to /var/cache/conftool/dbconfig/20210810-050051-root.json [production]
05:00 <marostegui> Starting s2 codfw failover from db2107 to db2104 - T287454 [production]
04:23 <marostegui@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 24 hosts with reason: Master switchover s2 T287454 [production]
04:23 <marostegui@cumin1001> START - Cookbook sre.hosts.downtime for 1:00:00 on 24 hosts with reason: Master switchover s2 T287454 [production]
04:16 <marostegui@cumin1001> dbctl commit (dc=all): 'Set db2104 with weight 0 T287454', diff saved to https://phabricator.wikimedia.org/P16981 and previous config saved to /var/cache/conftool/dbconfig/20210810-041627-root.json [production]
02:35 <mwdebug-deploy@deploy1002> helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . [production]
02:33 <mwdebug-deploy@deploy1002> helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . [production]
02:07 <mwdebug-deploy@deploy1002> helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . [production]
02:06 <mwdebug-deploy@deploy1002> helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . [production]
2021-08-09
16:12 <legoktm@deploy1002> helmfile [codfw] Ran 'sync' command on namespace 'shellbox-constraints' for release 'main' . [production]
16:10 <jayme@deploy1002> helmfile [codfw] DONE helmfile.d/admin 'apply'. [production]
16:09 <jayme@deploy1002> helmfile [codfw] START helmfile.d/admin 'apply'. [production]
16:07 <legoktm@deploy1002> helmfile [eqiad] Ran 'sync' command on namespace 'shellbox-constraints' for release 'main' . [production]
16:07 <jayme@deploy1002> helmfile [eqiad] DONE helmfile.d/admin 'apply'. [production]
16:07 <jayme@deploy1002> helmfile [eqiad] START helmfile.d/admin 'apply'. [production]
16:04 <legoktm@deploy1002> helmfile [staging] Ran 'sync' command on namespace 'shellbox-constraints' for release 'main' . [production]
16:03 <jayme@deploy1002> helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'. [production]
16:03 <jayme@deploy1002> helmfile [staging-eqiad] START helmfile.d/admin 'apply'. [production]
16:03 <jayme@deploy1002> helmfile [staging-codfw] DONE helmfile.d/admin 'apply'. [production]
16:02 <legoktm@deploy1002> helmfile [codfw] Ran 'sync' command on namespace 'shellbox' for release 'main' . [production]
16:02 <jayme@deploy1002> helmfile [staging-codfw] START helmfile.d/admin 'apply'. [production]
16:00 <legoktm@deploy1002> helmfile [eqiad] Ran 'sync' command on namespace 'shellbox' for release 'main' . [production]
16:00 <jayme@deploy1002> helmfile [staging-codfw] DONE helmfile.d/admin 'apply'. [production]
16:00 <jayme@deploy1002> helmfile [staging-codfw] START helmfile.d/admin 'apply'. [production]
15:57 <legoktm@deploy1002> helmfile [staging] Ran 'sync' command on namespace 'shellbox' for release 'main' . [production]
15:34 <filippo@cumin1001> END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host ms-be2065.codfw.wmnet [production]
15:33 <filippo@cumin1001> END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host ms-be2064.codfw.wmnet [production]
15:33 <filippo@cumin1001> END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host ms-be2062.codfw.wmnet [production]
14:17 <sukhe> ran homer for Gerrit 710358: Set up BGP peering to doh5002 in eqsin [production]
14:10 <filippo@cumin1001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2063.codfw.wmnet [production]
14:09 <hnowlan@puppetmaster1001> conftool action : set/pooled=no; selector: name=maps100[1234].eqiad.wmnet [production]
14:06 <jayme> re-enabled (and ran) puppet on all kubernetes nodes - T288345 [production]