2023-02-10 §
15:56 <elukey@cumin1001> START - Cookbook sre.hosts.downtime for 3 days, 0:00:00 on ml-staging[2001-2002].codfw.wmnet,ml-staging-ctrl[2001-2002].codfw.wmnet,ml-staging-etcd2003.codfw.wmnet with reason: Cluster half broken, in the middle of upgrading [production]
15:49 <elukey@cumin1001> END (FAIL) - Cookbook sre.k8s.upgrade-cluster (exit_code=99) Upgrade K8s version: Upgrade ml-staging-codfw cluster to 1.23 [production]
15:49 <elukey@cumin1001> END (FAIL) - Cookbook sre.ganeti.reimage (exit_code=99) for host ml-staging-etcd2002.codfw.wmnet with OS bullseye [production]
14:53 <elukey@cumin1001> START - Cookbook sre.ganeti.reimage for host ml-staging-etcd2002.codfw.wmnet with OS bullseye [production]
14:52 <elukey@cumin1001> START - Cookbook sre.k8s.upgrade-cluster Upgrade K8s version: Upgrade ml-staging-codfw cluster to 1.23 [production]
14:43 <elukey@cumin1001> END (FAIL) - Cookbook sre.k8s.upgrade-cluster (exit_code=99) Upgrade K8s version: Upgrade ml-staging-codfw cluster to 1.23 [production]
14:36 <elukey@cumin1001> END (FAIL) - Cookbook sre.ganeti.reimage (exit_code=99) for host ml-staging-etcd2001.codfw.wmnet with OS bullseye [production]
14:36 <elukey@cumin1001> START - Cookbook sre.ganeti.reimage for host ml-staging-etcd2001.codfw.wmnet with OS bullseye [production]
14:33 <elukey@cumin1001> START - Cookbook sre.k8s.upgrade-cluster Upgrade K8s version: Upgrade ml-staging-codfw cluster to 1.23 [production]
13:49 <elukey@cumin1001> END (FAIL) - Cookbook sre.k8s.upgrade-cluster (exit_code=99) Upgrade K8s version: Upgrade ml-staging-codfw cluster to 1.23 [production]
13:48 <elukey@cumin1001> START - Cookbook sre.k8s.upgrade-cluster Upgrade K8s version: Upgrade ml-staging-codfw cluster to 1.23 [production]
07:59 <elukey@cumin1001> END (FAIL) - Cookbook sre.k8s.upgrade-cluster (exit_code=99) Upgrade K8s version: Upgrade ml-staging-codfw cluster to 1.23 [production]
07:59 <elukey@cumin1001> START - Cookbook sre.k8s.upgrade-cluster Upgrade K8s version: Upgrade ml-staging-codfw cluster to 1.23 [production]
07:43 <elukey@cumin1001> END (FAIL) - Cookbook sre.k8s.upgrade-cluster (exit_code=99) Upgrade K8s version: Upgrade ml-staging-codfw cluster to 1.23 [production]
07:41 <elukey@cumin1001> START - Cookbook sre.k8s.upgrade-cluster Upgrade K8s version: Upgrade ml-staging-codfw cluster to 1.23 [production]
2023-02-09 §
13:40 <elukey> restart prometheus-statsd-exporter on ores nodes to pick up label change - T325763 [production]
2023-02-07 §
11:52 <elukey@cumin1001> END (PASS) - Cookbook sre.kafka.roll-restart-brokers (exit_code=0) for Kafka A:kafka-logging-eqiad cluster: Roll restart of jvm daemons. [production]
10:51 <elukey@cumin1001> START - Cookbook sre.kafka.roll-restart-brokers for Kafka A:kafka-logging-eqiad cluster: Roll restart of jvm daemons. [production]
10:49 <elukey@cumin1001> END (PASS) - Cookbook sre.kafka.roll-restart-brokers (exit_code=0) for Kafka A:kafka-logging-codfw cluster: Roll restart of jvm daemons. [production]
09:08 <elukey@cumin1001> START - Cookbook sre.kafka.roll-restart-brokers for Kafka A:kafka-logging-codfw cluster: Roll restart of jvm daemons. [production]
2023-02-06 §
09:05 <elukey@deploy1002> helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' . [production]
09:05 <elukey@deploy1002> helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' . [production]
09:04 <elukey@deploy1002> helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' . [production]
09:04 <elukey@deploy1002> helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' . [production]
2023-02-02 §
17:12 <elukey@deploy1002> helmfile [eqiad] DONE helmfile.d/services/changeprop: sync [production]
17:12 <elukey@deploy1002> helmfile [eqiad] START helmfile.d/services/changeprop: sync [production]
16:47 <elukey@deploy1002> helmfile [codfw] DONE helmfile.d/services/changeprop: sync [production]
16:46 <elukey@deploy1002> helmfile [codfw] START helmfile.d/services/changeprop: sync [production]
11:22 <elukey@deploy1002> helmfile [staging] DONE helmfile.d/services/changeprop: sync [production]
11:22 <elukey@deploy1002> helmfile [staging] START helmfile.d/services/changeprop: sync [production]
09:55 <elukey@deploy1002> helmfile [eqiad] DONE helmfile.d/services/eventgate-main: sync [production]
09:54 <elukey@deploy1002> helmfile [eqiad] START helmfile.d/services/eventgate-main: sync [production]
09:40 <elukey@deploy1002> helmfile [codfw] DONE helmfile.d/services/eventgate-main: sync [production]
09:40 <elukey@deploy1002> helmfile [codfw] START helmfile.d/services/eventgate-main: sync [production]
09:13 <elukey@deploy1002> helmfile [staging] DONE helmfile.d/services/eventgate-main: sync [production]
09:12 <elukey@deploy1002> helmfile [staging] START helmfile.d/services/eventgate-main: sync [production]
09:11 <elukey> roll restart of eventgate-main pods in wikikube eqiad/codfw to pick up new stream configs - T328576 [production]
2023-02-01 §
16:23 <elukey@deploy1002> helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' . [production]
16:22 <elukey@deploy1002> helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' . [production]
14:41 <awight@deploy1002> elukey and awight: Backport for [[gerrit:884155|wmf-config: add new revision-score streams for EventGate main (T317768)]] synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet [production]
10:42 <elukey@deploy1002> helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'. [production]
10:42 <elukey@deploy1002> helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'. [production]
10:42 <elukey@deploy1002> helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'. [production]
10:42 <elukey@deploy1002> helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'. [production]
10:41 <elukey@deploy1002> helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'. [production]
10:41 <elukey@deploy1002> helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'. [production]
08:27 <elukey@deploy1002> helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'. [production]
08:27 <elukey@deploy1002> helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'. [production]
08:27 <elukey@deploy1002> helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'. [production]
08:27 <elukey@deploy1002> helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'. [production]