2023-02-10
|
15:56 |
<elukey@cumin1001> |
START - Cookbook sre.hosts.downtime for 3 days, 0:00:00 on ml-staging[2001-2002].codfw.wmnet,ml-staging-ctrl[2001-2002].codfw.wmnet,ml-staging-etcd2003.codfw.wmnet with reason: Cluster half broken, in the middle of upgrading |
[production] |
15:49 |
<elukey@cumin1001> |
END (FAIL) - Cookbook sre.k8s.upgrade-cluster (exit_code=99) Upgrade K8s version: Upgrade ml-staging-codfw cluster to 1.23 |
[production] |
15:49 |
<elukey@cumin1001> |
END (FAIL) - Cookbook sre.ganeti.reimage (exit_code=99) for host ml-staging-etcd2002.codfw.wmnet with OS bullseye |
[production] |
14:53 |
<elukey@cumin1001> |
START - Cookbook sre.ganeti.reimage for host ml-staging-etcd2002.codfw.wmnet with OS bullseye |
[production] |
14:52 |
<elukey@cumin1001> |
START - Cookbook sre.k8s.upgrade-cluster Upgrade K8s version: Upgrade ml-staging-codfw cluster to 1.23 |
[production] |
14:43 |
<elukey@cumin1001> |
END (FAIL) - Cookbook sre.k8s.upgrade-cluster (exit_code=99) Upgrade K8s version: Upgrade ml-staging-codfw cluster to 1.23 |
[production] |
14:36 |
<elukey@cumin1001> |
END (FAIL) - Cookbook sre.ganeti.reimage (exit_code=99) for host ml-staging-etcd2001.codfw.wmnet with OS bullseye |
[production] |
14:36 |
<elukey@cumin1001> |
START - Cookbook sre.ganeti.reimage for host ml-staging-etcd2001.codfw.wmnet with OS bullseye |
[production] |
14:33 |
<elukey@cumin1001> |
START - Cookbook sre.k8s.upgrade-cluster Upgrade K8s version: Upgrade ml-staging-codfw cluster to 1.23 |
[production] |
13:49 |
<elukey@cumin1001> |
END (FAIL) - Cookbook sre.k8s.upgrade-cluster (exit_code=99) Upgrade K8s version: Upgrade ml-staging-codfw cluster to 1.23 |
[production] |
13:48 |
<elukey@cumin1001> |
START - Cookbook sre.k8s.upgrade-cluster Upgrade K8s version: Upgrade ml-staging-codfw cluster to 1.23 |
[production] |
07:59 |
<elukey@cumin1001> |
END (FAIL) - Cookbook sre.k8s.upgrade-cluster (exit_code=99) Upgrade K8s version: Upgrade ml-staging-codfw cluster to 1.23 |
[production] |
07:59 |
<elukey@cumin1001> |
START - Cookbook sre.k8s.upgrade-cluster Upgrade K8s version: Upgrade ml-staging-codfw cluster to 1.23 |
[production] |
07:43 |
<elukey@cumin1001> |
END (FAIL) - Cookbook sre.k8s.upgrade-cluster (exit_code=99) Upgrade K8s version: Upgrade ml-staging-codfw cluster to 1.23 |
[production] |
07:41 |
<elukey@cumin1001> |
START - Cookbook sre.k8s.upgrade-cluster Upgrade K8s version: Upgrade ml-staging-codfw cluster to 1.23 |
[production] |
2023-02-02
|
17:12 |
<elukey@deploy1002> |
helmfile [eqiad] DONE helmfile.d/services/changeprop: sync |
[production] |
17:12 |
<elukey@deploy1002> |
helmfile [eqiad] START helmfile.d/services/changeprop: sync |
[production] |
16:47 |
<elukey@deploy1002> |
helmfile [codfw] DONE helmfile.d/services/changeprop: sync |
[production] |
16:46 |
<elukey@deploy1002> |
helmfile [codfw] START helmfile.d/services/changeprop: sync |
[production] |
11:22 |
<elukey@deploy1002> |
helmfile [staging] DONE helmfile.d/services/changeprop: sync |
[production] |
11:22 |
<elukey@deploy1002> |
helmfile [staging] START helmfile.d/services/changeprop: sync |
[production] |
09:55 |
<elukey@deploy1002> |
helmfile [eqiad] DONE helmfile.d/services/eventgate-main: sync |
[production] |
09:54 |
<elukey@deploy1002> |
helmfile [eqiad] START helmfile.d/services/eventgate-main: sync |
[production] |
09:40 |
<elukey@deploy1002> |
helmfile [codfw] DONE helmfile.d/services/eventgate-main: sync |
[production] |
09:40 |
<elukey@deploy1002> |
helmfile [codfw] START helmfile.d/services/eventgate-main: sync |
[production] |
09:13 |
<elukey@deploy1002> |
helmfile [staging] DONE helmfile.d/services/eventgate-main: sync |
[production] |
09:12 |
<elukey@deploy1002> |
helmfile [staging] START helmfile.d/services/eventgate-main: sync |
[production] |
09:11 |
<elukey> |
roll restart of eventgate-main pods in wikikube eqiad/codfw to pick up new stream configs - T328576 |
[production] |
2023-02-01
|
16:23 |
<elukey@deploy1002> |
helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' . |
[production] |
16:22 |
<elukey@deploy1002> |
helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' . |
[production] |
14:41 |
<awight@deploy1002> |
elukey and awight: Backport for [[gerrit:884155|wmf-config: add new revision-score streams for EventGate main (T317768)]] synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet |
[production] |
10:42 |
<elukey@deploy1002> |
helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'. |
[production] |
10:42 |
<elukey@deploy1002> |
helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'. |
[production] |
10:42 |
<elukey@deploy1002> |
helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'. |
[production] |
10:42 |
<elukey@deploy1002> |
helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'. |
[production] |
10:41 |
<elukey@deploy1002> |
helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'. |
[production] |
10:41 |
<elukey@deploy1002> |
helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'. |
[production] |
08:27 |
<elukey@deploy1002> |
helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'. |
[production] |
08:27 |
<elukey@deploy1002> |
helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'. |
[production] |
08:27 |
<elukey@deploy1002> |
helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'. |
[production] |
08:27 |
<elukey@deploy1002> |
helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'. |
[production] |