2022-06-01 §
13:40 <aikochou@deploy1002> Started deploy [ores/deploy@3d541df]: Deploy revscoring 2.11.4 to ORES - T309536 [production]
13:32 <kevinbazira@deploy1002> helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' . [production]
13:32 <kevinbazira@deploy1002> helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' . [production]
12:41 <moritzm> installing ruby-nokogiri security updates [production]
12:24 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1136 (T309617)', diff saved to https://phabricator.wikimedia.org/P29320 and previous config saved to /var/cache/conftool/dbconfig/20220601-122426-ladsgroup.json [production]
12:09 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1136', diff saved to https://phabricator.wikimedia.org/P29318 and previous config saved to /var/cache/conftool/dbconfig/20220601-120921-ladsgroup.json [production]
11:54 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1136', diff saved to https://phabricator.wikimedia.org/P29317 and previous config saved to /var/cache/conftool/dbconfig/20220601-115416-ladsgroup.json [production]
11:44 <marostegui@cumin1001> dbctl commit (dc=all): 'Repool db1137 in x1 with minimal weight to test 10.6.8 T309679 ', diff saved to https://phabricator.wikimedia.org/P29315 and previous config saved to /var/cache/conftool/dbconfig/20220601-114418-marostegui.json [production]
11:39 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1136 (T309617)', diff saved to https://phabricator.wikimedia.org/P29314 and previous config saved to /var/cache/conftool/dbconfig/20220601-113911-ladsgroup.json [production]
11:30 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Depooling db1136 (T309617)', diff saved to https://phabricator.wikimedia.org/P29313 and previous config saved to /var/cache/conftool/dbconfig/20220601-113017-ladsgroup.json [production]
11:30 <ladsgroup@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1136.eqiad.wmnet with reason: Maintenance [production]
11:30 <ladsgroup@cumin1001> START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1136.eqiad.wmnet with reason: Maintenance [production]
11:21 <mwdebug-deploy@deploy1002> helmfile [codfw] DONE helmfile.d/services/mwdebug: apply [production]
11:21 <ladsgroup@deploy1002> Synchronized php-1.39.0-wmf.13/extensions/PageTriage/includes/Hooks.php: Backport: [[gerrit:802107|Don't call saveOptions in LocalUserCreated (T306636)]] (duration: 03m 16s) [production]
11:20 <mwdebug-deploy@deploy1002> helmfile [codfw] START helmfile.d/services/mwdebug: apply [production]
11:20 <mwdebug-deploy@deploy1002> helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply [production]
11:19 <mwdebug-deploy@deploy1002> helmfile [eqiad] START helmfile.d/services/mwdebug: apply [production]
11:18 <marostegui@cumin1001> dbctl commit (dc=all): 'Repool db1137 in x1 with minimal weight to test 10.6.8 T309679 ', diff saved to https://phabricator.wikimedia.org/P29312 and previous config saved to /var/cache/conftool/dbconfig/20220601-111805-marostegui.json [production]
11:16 <mvernon@cumin2002> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be1045.eqiad.wmnet with OS bullseye [production]
11:15 <ladsgroup@deploy1002> Synchronized php-1.39.0-wmf.14/extensions/PageTriage/includes/Hooks.php: Backport: [[gerrit:802106|Don't call saveOptions in LocalUserCreated (T306636)]] (duration: 03m 01s) [production]
11:14 <mwdebug-deploy@deploy1002> helmfile [codfw] DONE helmfile.d/services/mwdebug: apply [production]
11:14 <mwdebug-deploy@deploy1002> helmfile [codfw] START helmfile.d/services/mwdebug: apply [production]
11:14 <mwdebug-deploy@deploy1002> helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply [production]
11:13 <mwdebug-deploy@deploy1002> helmfile [eqiad] START helmfile.d/services/mwdebug: apply [production]
11:04 <ladsgroup@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1150.eqiad.wmnet with reason: Maintenance [production]
11:04 <ladsgroup@cumin1001> START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1150.eqiad.wmnet with reason: Maintenance [production]
10:54 <XioNoX> upgrade fastnetmon to 1.2.1 in eqsin - T271228 [production]
10:51 <XioNoX> upgrade fastnetmon to 1.2.1 in esams - T271228 [production]
10:49 <XioNoX> upgrade fastnetmon to 1.2.1 in eqiad - T271228 [production]
10:48 <mvernon@cumin2002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be1045.eqiad.wmnet with reason: host reimage [production]
10:45 <mvernon@cumin2002> START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be1045.eqiad.wmnet with reason: host reimage [production]
10:28 <mvernon@cumin2002> START - Cookbook sre.hosts.reimage for host ms-be1045.eqiad.wmnet with OS bullseye [production]
10:13 <moritzm> installing openldap security updates [production]
10:11 <mvernon@cumin2002> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be1044.eqiad.wmnet with OS bullseye [production]
09:56 <mvernon@cumin2002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be1044.eqiad.wmnet with reason: host reimage [production]
09:39 <mwdebug-deploy@deploy1002> helmfile [eqiad] START helmfile.d/services/mwdebug: apply [production]
09:36 <mvernon@cumin2002> START - Cookbook sre.hosts.reimage for host ms-be1044.eqiad.wmnet with OS bullseye [production]
09:08 <mvernon@cumin2002> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be1043.eqiad.wmnet with OS bullseye [production]
08:56 <marostegui@cumin1001> dbctl commit (dc=all): 'Repool db1137 in x1 with minimal weight to test 10.6.8 T309679 ', diff saved to https://phabricator.wikimedia.org/P29307 and previous config saved to /var/cache/conftool/dbconfig/20220601-085620-marostegui.json [production]
08:49 <moritzm> installing idp1002 T308214 [production]
08:48 <mvernon@cumin2002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be1043.eqiad.wmnet with reason: host reimage [production]
08:45 <mvernon@cumin2002> START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be1043.eqiad.wmnet with reason: host reimage [production]
08:43 <elukey@deploy1002> helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'. [production]
08:43 <elukey@deploy1002> helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'. [production]
08:41 <elukey@deploy1002> helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'. [production]
08:41 <elukey@deploy1002> helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'. [production]
08:39 <elukey> powercycle an-worker1094 - OEM event registered in `racadm getsel`, host frozen [production]
08:30 <mvernon@cumin2002> START - Cookbook sre.hosts.reimage for host ms-be1043.eqiad.wmnet with OS bullseye [production]
08:30 <mvernon@cumin2002> END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ms-be1043.eqiad.wmnet with OS bullseye [production]
08:20 <moritzm> installing openssl security updates [production]