|
2023-02-10
|
| 15:56 |
<elukey@cumin1001> |
START - Cookbook sre.hosts.downtime for 3 days, 0:00:00 on ml-staging[2001-2002].codfw.wmnet,ml-staging-ctrl[2001-2002].codfw.wmnet,ml-staging-etcd2003.codfw.wmnet with reason: Cluster half broken, in the middle of upgrading |
[production] |
| 15:49 |
<elukey@cumin1001> |
END (FAIL) - Cookbook sre.k8s.upgrade-cluster (exit_code=99) Upgrade K8s version: Upgrade ml-staging-codfw cluster to 1.23 |
[production] |
| 15:49 |
<elukey@cumin1001> |
END (FAIL) - Cookbook sre.ganeti.reimage (exit_code=99) for host ml-staging-etcd2002.codfw.wmnet with OS bullseye |
[production] |
| 14:53 |
<elukey@cumin1001> |
START - Cookbook sre.ganeti.reimage for host ml-staging-etcd2002.codfw.wmnet with OS bullseye |
[production] |
| 14:52 |
<elukey@cumin1001> |
START - Cookbook sre.k8s.upgrade-cluster Upgrade K8s version: Upgrade ml-staging-codfw cluster to 1.23 |
[production] |
| 14:43 |
<elukey@cumin1001> |
END (FAIL) - Cookbook sre.k8s.upgrade-cluster (exit_code=99) Upgrade K8s version: Upgrade ml-staging-codfw cluster to 1.23 |
[production] |
| 14:36 |
<elukey@cumin1001> |
END (FAIL) - Cookbook sre.ganeti.reimage (exit_code=99) for host ml-staging-etcd2001.codfw.wmnet with OS bullseye |
[production] |
| 14:36 |
<elukey@cumin1001> |
START - Cookbook sre.ganeti.reimage for host ml-staging-etcd2001.codfw.wmnet with OS bullseye |
[production] |
| 14:33 |
<elukey@cumin1001> |
START - Cookbook sre.k8s.upgrade-cluster Upgrade K8s version: Upgrade ml-staging-codfw cluster to 1.23 |
[production] |
| 13:49 |
<elukey@cumin1001> |
END (FAIL) - Cookbook sre.k8s.upgrade-cluster (exit_code=99) Upgrade K8s version: Upgrade ml-staging-codfw cluster to 1.23 |
[production] |
| 13:48 |
<elukey@cumin1001> |
START - Cookbook sre.k8s.upgrade-cluster Upgrade K8s version: Upgrade ml-staging-codfw cluster to 1.23 |
[production] |
| 07:59 |
<elukey@cumin1001> |
END (FAIL) - Cookbook sre.k8s.upgrade-cluster (exit_code=99) Upgrade K8s version: Upgrade ml-staging-codfw cluster to 1.23 |
[production] |
| 07:59 |
<elukey@cumin1001> |
START - Cookbook sre.k8s.upgrade-cluster Upgrade K8s version: Upgrade ml-staging-codfw cluster to 1.23 |
[production] |
| 07:43 |
<elukey@cumin1001> |
END (FAIL) - Cookbook sre.k8s.upgrade-cluster (exit_code=99) Upgrade K8s version: Upgrade ml-staging-codfw cluster to 1.23 |
[production] |
| 07:41 |
<elukey@cumin1001> |
START - Cookbook sre.k8s.upgrade-cluster Upgrade K8s version: Upgrade ml-staging-codfw cluster to 1.23 |
[production] |
|
2023-02-02
|
| 17:12 |
<elukey@deploy1002> |
helmfile [eqiad] DONE helmfile.d/services/changeprop: sync |
[production] |
| 17:12 |
<elukey@deploy1002> |
helmfile [eqiad] START helmfile.d/services/changeprop: sync |
[production] |
| 16:47 |
<elukey@deploy1002> |
helmfile [codfw] DONE helmfile.d/services/changeprop: sync |
[production] |
| 16:46 |
<elukey@deploy1002> |
helmfile [codfw] START helmfile.d/services/changeprop: sync |
[production] |
| 11:22 |
<elukey@deploy1002> |
helmfile [staging] DONE helmfile.d/services/changeprop: sync |
[production] |
| 11:22 |
<elukey@deploy1002> |
helmfile [staging] START helmfile.d/services/changeprop: sync |
[production] |
| 09:55 |
<elukey@deploy1002> |
helmfile [eqiad] DONE helmfile.d/services/eventgate-main: sync |
[production] |
| 09:54 |
<elukey@deploy1002> |
helmfile [eqiad] START helmfile.d/services/eventgate-main: sync |
[production] |
| 09:40 |
<elukey@deploy1002> |
helmfile [codfw] DONE helmfile.d/services/eventgate-main: sync |
[production] |
| 09:40 |
<elukey@deploy1002> |
helmfile [codfw] START helmfile.d/services/eventgate-main: sync |
[production] |
| 09:13 |
<elukey@deploy1002> |
helmfile [staging] DONE helmfile.d/services/eventgate-main: sync |
[production] |
| 09:12 |
<elukey@deploy1002> |
helmfile [staging] START helmfile.d/services/eventgate-main: sync |
[production] |
| 09:11 |
<elukey> |
roll restart of eventgate-main pods in wikikube eqiad/codfw to pick up new stream configs - T328576 |
[production] |
|
2023-02-01
|
| 16:23 |
<elukey@deploy1002> |
helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' . |
[production] |
| 16:22 |
<elukey@deploy1002> |
helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' . |
[production] |
| 14:41 |
<awight@deploy1002> |
elukey and awight: Backport for [[gerrit:884155|wmf-config: add new revision-score streams for EventGate main (T317768)]] synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet |
[production] |
| 10:42 |
<elukey@deploy1002> |
helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'. |
[production] |
| 10:42 |
<elukey@deploy1002> |
helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'. |
[production] |
| 10:42 |
<elukey@deploy1002> |
helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'. |
[production] |
| 10:42 |
<elukey@deploy1002> |
helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'. |
[production] |
| 10:41 |
<elukey@deploy1002> |
helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'. |
[production] |
| 10:41 |
<elukey@deploy1002> |
helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'. |
[production] |
| 08:27 |
<elukey@deploy1002> |
helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'. |
[production] |
| 08:27 |
<elukey@deploy1002> |
helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'. |
[production] |
| 08:27 |
<elukey@deploy1002> |
helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'. |
[production] |
| 08:27 |
<elukey@deploy1002> |
helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'. |
[production] |