2022-10-15 §
22:44 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Set db1173 with weight 0 T320879', diff saved to https://phabricator.wikimedia.org/P35492 and previous config saved to /var/cache/conftool/dbconfig/20221015-224455-ladsgroup.json [production]
22:44 <ladsgroup@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 26 hosts with reason: Primary switchover s6 T320879 [production]
22:44 <ladsgroup@cumin1001> START - Cookbook sre.hosts.downtime for 1:00:00 on 26 hosts with reason: Primary switchover s6 T320879 [production]
2022-10-14 §
22:56 <mutante> pcc-worker1003.puppet-diffs.eqiad1.wikimedia.cloud - out of disk space again - deleted 3.5GB job "1460" to unblock puppet compiling [production]
20:48 <jhathaway@cumin1001> END (PASS) - Cookbook sre.network.cf (exit_code=0) [production]
20:48 <jhathaway@cumin1001> START - Cookbook sre.network.cf [production]
19:57 <oblivian@cumin1001> END (PASS) - Cookbook sre.network.cf (exit_code=0) [production]
19:57 <oblivian@cumin1001> START - Cookbook sre.network.cf [production]
19:55 <oblivian@cumin1001> END (PASS) - Cookbook sre.network.cf (exit_code=0) [production]
19:55 <oblivian@cumin1001> START - Cookbook sre.network.cf [production]
18:08 <mutante> contint* - temp disabled puppet, deploying gerrit:834400, docker version upgrade on CI servers (T318382) [production]
15:49 <elukey@deploy1002> helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'. [production]
15:48 <elukey@deploy1002> helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'. [production]
15:46 <elukey@deploy1002> helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'. [production]
15:45 <elukey@deploy1002> helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'. [production]
15:44 <elukey@deploy1002> helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'. [production]
15:43 <elukey@deploy1002> helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'. [production]
15:40 <elukey@deploy1002> helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'. [production]
15:40 <elukey@deploy1002> helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'. [production]
14:48 <elukey@deploy1002> helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' . [production]
14:47 <elukey@deploy1002> helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' . [production]
14:47 <elukey@deploy1002> helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' . [production]
14:43 <elukey@deploy1002> helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' . [production]
14:42 <elukey@deploy1002> helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' . [production]
14:40 <elukey@deploy1002> helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' . [production]
14:32 <elukey@deploy1002> helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' . [production]
14:31 <elukey@deploy1002> helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' . [production]
14:29 <elukey@deploy1002> helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' . [production]
14:29 <jclark@cumin1001> END (PASS) - Cookbook sre.dns.netbox (exit_code=0) [production]
14:29 <elukey@deploy1002> helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' . [production]
14:28 <elukey@deploy1002> helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' . [production]
14:27 <elukey@deploy1002> helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' . [production]
14:27 <elukey@deploy1002> helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' . [production]
14:27 <jclark@cumin1001> START - Cookbook sre.dns.netbox [production]
14:27 <elukey@deploy1002> helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' . [production]
14:22 <elukey@deploy1002> helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' . [production]
14:21 <elukey@deploy1002> helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' . [production]
14:19 <elukey@deploy1002> helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' . [production]
14:17 <elukey@deploy1002> helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' . [production]
14:09 <elukey@deploy1002> helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' . [production]
14:09 <elukey@deploy1002> helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' . [production]
14:06 <elukey@deploy1002> helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' . [production]
13:59 <jclark@cumin1001> END (PASS) - Cookbook sre.dns.netbox (exit_code=0) [production]
13:57 <jclark@cumin1001> START - Cookbook sre.dns.netbox [production]
13:57 <jclark@cumin1001> END (FAIL) - Cookbook sre.dns.netbox (exit_code=99) [production]
13:55 <jclark@cumin1001> START - Cookbook sre.dns.netbox [production]
12:01 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Depool db1202 - Degraded RAID (T320786)', diff saved to https://phabricator.wikimedia.org/P35487 and previous config saved to /var/cache/conftool/dbconfig/20221014-120155-ladsgroup.json [production]
10:22 <godog> upgrade grafana to 8.5.14 [production]
10:15 <dcausse> Deployed patch for T320785 [production]
08:47 <ryankemper@cumin2002> END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts elastic[2028-2030].codfw.wmnet [production]