2022-10-10 §
07:26 <elukey> kill hanging process for user bmansurov on deploy1002 to allow proper user cleanup [production]
2022-10-07 §
09:26 <elukey> delete calico pods in CrashLoop on dse-k8s-codfw (probably due to the incorrect docker settings) [production]
07:54 <elukey> re-initialize docker on dse-k8s-worker1004 - wrong storage type set (devicemapper instead of overlay2) [production]
07:49 <elukey> re-initialize docker on dse-k8s-worker100[5-8] - wrong storage type set (devicemapper instead of overlay2) [production]
2022-10-06 §
15:53 <elukey@deploy1002> helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' . [production]
15:52 <elukey@deploy1002> helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' . [production]
15:52 <elukey@deploy1002> helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' . [production]
15:51 <elukey@deploy1002> helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' . [production]
15:51 <elukey@deploy1002> helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' . [production]
15:49 <elukey@deploy1002> helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' . [production]
15:45 <elukey@deploy1002> helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'. [production]
15:44 <elukey@deploy1002> helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'. [production]
15:22 <elukey@deploy1002> helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'. [production]
15:21 <elukey@deploy1002> helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'. [production]
15:08 <elukey@deploy1002> helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'. [production]
15:08 <elukey@deploy1002> helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'. [production]
13:41 <elukey@cumin1001> END (PASS) - Cookbook sre.kafka.roll-restart-brokers (exit_code=0) for Kafka A:kafka-logging-codfw cluster: Roll restart of jvm daemons. [production]
13:14 <elukey@deploy1002> helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'. [production]
13:14 <elukey@deploy1002> helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'. [production]
12:40 <elukey@cumin1001> START - Cookbook sre.kafka.roll-restart-brokers for Kafka A:kafka-logging-codfw cluster: Roll restart of jvm daemons. [production]
10:30 <elukey> restart kafka on kafka-logging1003 to reload the config (cleanup old super.users related to past keystore) [production]
08:10 <elukey> restart kafka on kafka-logging1002 to reload the config (cleanup old super.users related to past keystore) [production]
08:09 <elukey> kafka logging old cert cleanup - `cumin 'A:kafka-logging' 'rm -f /etc/kafka/ssl/kafka_logging-eqiad_broker.keystore.jks'` [production]
08:00 <elukey> delete /etc/kafka/ssl/kafka_logging-eqiad_broker.keystore.jks on kafka-logging1001 and restart (old puppet cert + settings deleted) [production]
2022-10-05 §
14:07 <elukey@deploy1002> helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'. [production]
14:06 <elukey@deploy1002> helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'. [production]
07:54 <elukey> restart kafka on kafka-logging1003 to pick up new PKI TLS settings [production]
07:50 <elukey@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:20:00 on kafka-logging1003.eqiad.wmnet with reason: Kafka PKI upgrade [production]
07:49 <elukey@cumin1001> START - Cookbook sre.hosts.downtime for 0:20:00 on kafka-logging1003.eqiad.wmnet with reason: Kafka PKI upgrade [production]
06:30 <elukey> restart kafka on kafka-logging1002 to pick up the new cert+settings for PKI [production]
06:27 <elukey@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:20:00 on kafka-logging1002.eqiad.wmnet with reason: Kafka PKI upgrade [production]
06:27 <elukey@cumin1001> START - Cookbook sre.hosts.downtime for 0:20:00 on kafka-logging1002.eqiad.wmnet with reason: Kafka PKI upgrade [production]
2022-10-04 §
15:25 <elukey@deploy1002> helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'. [production]
15:25 <elukey@deploy1002> helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'. [production]
13:14 <elukey@deploy1002> helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'. [production]
13:13 <elukey@deploy1002> helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'. [production]
07:36 <elukey@deploy1002> helmfile [codfw] DONE helmfile.d/services/eventgate-logging-external: sync [production]
07:36 <elukey@deploy1002> helmfile [codfw] START helmfile.d/services/eventgate-logging-external: sync [production]
07:16 <elukey> restart kafka on kafka-logging1001 to pick up its new PKI TLS cert [production]
07:11 <elukey@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:20:00 on kafka-logging1001.eqiad.wmnet with reason: Kafka PKI upgrade [production]
07:11 <elukey@cumin1001> START - Cookbook sre.hosts.downtime for 0:20:00 on kafka-logging1001.eqiad.wmnet with reason: Kafka PKI upgrade [production]
2022-10-02 §
08:13 <elukey> `apt-get clean` on an-airflow1001 to free some space on the root partition [production]
2022-09-30 §
13:23 <elukey@deploy1002> helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'. [production]
13:23 <elukey@deploy1002> helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'. [production]
13:23 <elukey@deploy1002> helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'. [production]
13:22 <elukey@deploy1002> helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'. [production]
13:22 <elukey@deploy1002> helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'. [production]
13:22 <elukey@deploy1002> helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'. [production]
07:27 <elukey@deploy1002> helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'. [production]
07:27 <elukey@deploy1002> helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'. [production]