2020-09-08
13:16 <akosiaris@deploy1001> helmfile [eqiad] Ran 'sync' command on namespace 'eventgate-analytics-external' for release 'canary' . [production]
13:16 <akosiaris@deploy1001> helmfile [eqiad] Ran 'sync' command on namespace 'eventgate-analytics-external' for release 'production' . [production]
13:16 <akosiaris@deploy1001> helmfile [eqiad] Ran 'sync' command on namespace 'echostore' for release 'production' . [production]
13:16 <akosiaris@deploy1001> helmfile [eqiad] Ran 'sync' command on namespace 'echostore' for release 'staging' . [production]
13:14 <elukey@cumin1001> END (FAIL) - Cookbook sre.hadoop.roll-restart-masters (exit_code=99) [production]
13:14 <elukey@cumin1001> START - Cookbook sre.hadoop.roll-restart-masters [production]
13:13 <akosiaris@deploy1001> helmfile [eqiad] Ran 'sync' command on namespace 'changeprop-jobqueue' for release 'production' . [production]
13:12 <akosiaris@deploy1001> helmfile [eqiad] Ran 'sync' command on namespace 'changeprop' for release 'production' . [production]
13:09 <akosiaris@deploy1001> helmfile [eqiad] Ran 'sync' command on namespace 'cxserver' for release 'staging' . [production]
13:09 <akosiaris@deploy1001> helmfile [eqiad] Ran 'sync' command on namespace 'cxserver' for release 'production' . [production]
13:08 <akosiaris@deploy1001> helmfile [eqiad] Ran 'sync' command on namespace 'citoid' for release 'production' . [production]
13:08 <akosiaris@deploy1001> helmfile [eqiad] Ran 'sync' command on namespace 'citoid' for release 'staging' . [production]
13:04 <akosiaris@deploy1001> helmfile [eqiad] Ran 'sync' command on namespace 'blubberoid' for release 'production' . [production]
13:04 <akosiaris@deploy1001> helmfile [eqiad] Ran 'sync' command on namespace 'blubberoid' for release 'staging' . [production]
12:47 <oblivian@deploy1001> helmfile [staging] Ran 'sync' command on namespace 'mobileapps' for release 'staging' . [production]
12:35 <kormat@cumin1001> dbctl commit (dc=all): 'Repooling after reboot. T261389', diff saved to https://phabricator.wikimedia.org/P12523 and previous config saved to /var/cache/conftool/dbconfig/20200908-123546-kormat.json [production]
12:34 <oblivian@deploy1001> helmfile [staging] Ran 'sync' command on namespace 'mobileapps' for release 'staging' . [production]
12:27 <kormat@cumin1001> dbctl commit (dc=all): 'Rebooting for T261389', diff saved to https://phabricator.wikimedia.org/P12522 and previous config saved to /var/cache/conftool/dbconfig/20200908-122702-kormat.json [production]
12:27 <kormat@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
12:27 <kormat@cumin1001> START - Cookbook sre.hosts.downtime [production]
12:11 <kormat@cumin1001> dbctl commit (dc=all): 'Repooling after reboot. T261389', diff saved to https://phabricator.wikimedia.org/P12521 and previous config saved to /var/cache/conftool/dbconfig/20200908-121139-kormat.json [production]
12:04 <kormat@cumin1001> dbctl commit (dc=all): 'Rebooting for T261389', diff saved to https://phabricator.wikimedia.org/P12520 and previous config saved to /var/cache/conftool/dbconfig/20200908-120419-kormat.json [production]
12:04 <kormat@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
12:04 <kormat@cumin1001> START - Cookbook sre.hosts.downtime [production]
11:34 <akosiaris@deploy1001> helmfile [eqiad] Ran 'sync' command on namespace 'kube-system' for release 'calico-policy-controller' . [production]
11:33 <akosiaris@deploy1001> helmfile [eqiad] Ran 'sync' command on namespace 'kube-system' for release 'coredns' . [production]
11:33 <akosiaris@deploy1001> helmfile [eqiad] Ran 'sync' command on namespace 'kube-system' for release 'rbac-deploy-clusterrole' . [production]
11:18 <jynus@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
11:15 <jynus@cumin1001> START - Cookbook sre.hosts.downtime [production]
10:53 <akosiaris@deploy1001> helmfile [eqiad] Ran 'sync' command on namespace 'kube-system' for release 'calico-policy-controller' . [production]
10:53 <marostegui> Deploy schema change on s3 eqiad master - T253276 [production]
10:53 <akosiaris@deploy1001> helmfile [eqiad] Ran 'sync' command on namespace 'kube-system' for release 'coredns' . [production]
10:53 <akosiaris@deploy1001> helmfile [eqiad] Ran 'sync' command on namespace 'kube-system' for release 'rbac-deploy-clusterrole' . [production]
10:20 <marostegui> Deploy schema change on s4 eqiad master - T253276 [production]
10:14 <jmm@cumin2001> END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) [production]
10:14 <jmm@cumin2001> START - Cookbook sre.hosts.reboot-single [production]
10:11 <jmm@cumin2001> END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) [production]
10:11 <jmm@cumin2001> START - Cookbook sre.hosts.reboot-single [production]
10:08 <kormat@cumin1001> dbctl commit (dc=all): 'Repooling after reboot. T261389', diff saved to https://phabricator.wikimedia.org/P12519 and previous config saved to /var/cache/conftool/dbconfig/20200908-100852-kormat.json [production]
09:52 <akosiaris> enable puppet, run it on all k8s eqiad nodes and double check that calico-node is fine T239835 [production]
09:43 <akosiaris> stopped calico-node and kube-apiserver on k8s nodes/masters T239835 [production]
09:43 <marostegui> Stop mysql on es2014 to clone es2026 T261717 [production]
09:39 <marostegui@cumin1001> dbctl commit (dc=all): 'Depool es2014 - T261717', diff saved to https://phabricator.wikimedia.org/P12517 and previous config saved to /var/cache/conftool/dbconfig/20200908-093957-marostegui.json [production]
09:37 <volans> running homer 'cr*eqiad*' commit "Update debmonitor IPs (#2), T261489" [production]
09:33 <jayme@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
09:33 <jayme@cumin1001> START - Cookbook sre.hosts.downtime [production]
09:28 <kormat@cumin1001> dbctl commit (dc=all): 'Rebooting for T261389', diff saved to https://phabricator.wikimedia.org/P12515 and previous config saved to /var/cache/conftool/dbconfig/20200908-092755-kormat.json [production]
09:27 <kormat@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
09:27 <kormat@cumin1001> START - Cookbook sre.hosts.downtime [production]
09:20 <jayme> disabling puppet on argon.eqiad.wmnet,chlorine.eqiad.wmnet,kubernetes[1001-1016].eqiad.wmnet - Reinitialize eqiad k8s cluster with new etcd - T239835 [production]