2022-02-21
09:38 <elukey@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on ml-staging2001.codfw.wmnet with reason: host reimage [production]
09:34 <elukey@cumin1001> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kubestage1003.eqiad.wmnet with OS bullseye [production]
09:24 <elukey@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubestage1003.eqiad.wmnet with reason: host reimage [production]
09:22 <elukey@cumin1001> START - Cookbook sre.hosts.reimage for host ml-staging2001.codfw.wmnet with OS bullseye [production]
09:22 <elukey@cumin1001> END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ml-staging2001.codfw.wmnet with OS bullseye [production]
09:22 <elukey@cumin1001> START - Cookbook sre.hosts.reimage for host ml-staging2001.codfw.wmnet with OS bullseye [production]
09:20 <elukey@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on kubestage1003.eqiad.wmnet with reason: host reimage [production]
09:04 <elukey@cumin1001> START - Cookbook sre.hosts.reimage for host kubestage1003.eqiad.wmnet with OS bullseye [production]
08:38 <elukey@puppetmaster1001> conftool action : set/pooled=yes; selector: dc=codfw,cluster=kubernetes-staging,service=kubesvc [production]
08:21 <elukey@cumin1001> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kubestage2002.codfw.wmnet with OS bullseye [production]
08:09 <elukey@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubestage2002.codfw.wmnet with reason: host reimage [production]
08:07 <elukey@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on kubestage2002.codfw.wmnet with reason: host reimage [production]
07:48 <elukey@cumin1001> START - Cookbook sre.hosts.reimage for host kubestage2002.codfw.wmnet with OS bullseye [production]
07:11 <elukey@deploy1002> helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' . [production]
07:10 <elukey@deploy1002> helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' . [production]
2022-02-18
07:57 <elukey@deploy1002> helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'. [production]
07:57 <elukey@deploy1002> helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'. [production]
07:57 <elukey@deploy1002> helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'. [production]
07:57 <elukey@deploy1002> helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'. [production]
07:42 <elukey@deploy1002> helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'. [production]
07:42 <elukey@deploy1002> helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'. [production]
07:41 <elukey@deploy1002> helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'. [production]
07:41 <elukey@deploy1002> helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'. [production]
2022-02-17
17:19 <elukey@puppetmaster1001> conftool action : set/pooled=yes; selector: dc=codfw,cluster=kubernetes-staging,service=kubesvc [production]
17:19 <elukey@cumin1001> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kubestage2001.codfw.wmnet with OS bullseye [production]
16:42 <elukey@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubestage2001.codfw.wmnet with reason: host reimage [production]
16:39 <elukey@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on kubestage2001.codfw.wmnet with reason: host reimage [production]
16:20 <elukey@cumin1001> START - Cookbook sre.hosts.reimage for host kubestage2001.codfw.wmnet with OS bullseye [production]
2022-02-16
09:52 <elukey@deploy1002> helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'. [production]
09:50 <elukey@deploy1002> helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'. [production]
09:45 <elukey@deploy1002> helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'. [production]
09:44 <elukey@deploy1002> helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'. [production]
2022-02-15
16:56 <elukey@deploy1002> helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' . [production]
16:55 <elukey@deploy1002> helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' . [production]
15:09 <elukey@cumin1001> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-serve2005.codfw.wmnet with OS bullseye [production]
14:37 <elukey@cumin1001> START - Cookbook sre.hosts.reimage for host ml-serve2005.codfw.wmnet with OS bullseye [production]
09:04 <elukey@puppetmaster1001> conftool action : set/pooled=yes; selector: name=ml-serve2008.codfw.wmnet [production]
09:04 <elukey@puppetmaster1001> conftool action : set/pooled=yes; selector: name=ml-serve2007.codfw.wmnet [production]
2022-02-14
08:58 <elukey@cumin1001> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-serve2008.codfw.wmnet with OS bullseye [production]
08:29 <elukey@cumin1001> START - Cookbook sre.hosts.reimage for host ml-serve2008.codfw.wmnet with OS bullseye [production]
08:13 <elukey@cumin1001> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-serve2007.codfw.wmnet with OS bullseye [production]
07:43 <elukey@cumin1001> START - Cookbook sre.hosts.reimage for host ml-serve2007.codfw.wmnet with OS bullseye [production]
2022-02-12
08:49 <elukey> truncate /var/log/auth.log to 1g on krb1001 to free space on root partition (original log saved under /srv) [production]
2022-02-10
14:23 <elukey@puppetmaster1001> conftool action : set/pooled=yes; selector: name=ml-serve2006.codfw.wmnet [production]
14:19 <elukey@puppetmaster1001> conftool action : set/pooled=yes; selector: name=ml-serve2006.codfw.wmnet [production]
14:19 <elukey@puppetmaster1001> conftool action : set/pooled=yes; selector: name=ml-serve2005.codfw.wmnet [production]
14:10 <elukey> `elukey@cumin1001:~$ homer 'cr*codfw*' commit "Add ml-serve2006 to the k8s ml-serve-codfw cluster's neighbors"` [production]
09:43 <elukey> update pcc facts [production]
07:16 <elukey@cumin1001> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-serve2006.codfw.wmnet with OS bullseye [production]
06:46 <elukey@cumin1001> START - Cookbook sre.hosts.reimage for host ml-serve2006.codfw.wmnet with OS bullseye [production]