2020-09-08
§
|
12:04 |
<kormat@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) |
[production] |
12:04 |
<kormat@cumin1001> |
START - Cookbook sre.hosts.downtime |
[production] |
11:34 |
<akosiaris@deploy1001> |
helmfile [eqiad] Ran 'sync' command on namespace 'kube-system' for release 'calico-policy-controller' . |
[production] |
11:33 |
<akosiaris@deploy1001> |
helmfile [eqiad] Ran 'sync' command on namespace 'kube-system' for release 'coredns' . |
[production] |
11:33 |
<akosiaris@deploy1001> |
helmfile [eqiad] Ran 'sync' command on namespace 'kube-system' for release 'rbac-deploy-clusterrole' . |
[production] |
11:18 |
<jynus@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) |
[production] |
11:15 |
<jynus@cumin1001> |
START - Cookbook sre.hosts.downtime |
[production] |
10:53 |
<akosiaris@deploy1001> |
helmfile [eqiad] Ran 'sync' command on namespace 'kube-system' for release 'calico-policy-controller' . |
[production] |
10:53 |
<marostegui> |
Deploy schema change on s3 eqiad master - T253276 |
[production] |
10:53 |
<akosiaris@deploy1001> |
helmfile [eqiad] Ran 'sync' command on namespace 'kube-system' for release 'coredns' . |
[production] |
10:53 |
<akosiaris@deploy1001> |
helmfile [eqiad] Ran 'sync' command on namespace 'kube-system' for release 'rbac-deploy-clusterrole' . |
[production] |
10:20 |
<marostegui> |
Deploy schema change on s4 eqiad master - T253276 |
[production] |
10:14 |
<jmm@cumin2001> |
END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) |
[production] |
10:14 |
<jmm@cumin2001> |
START - Cookbook sre.hosts.reboot-single |
[production] |
10:11 |
<jmm@cumin2001> |
END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) |
[production] |
10:11 |
<jmm@cumin2001> |
START - Cookbook sre.hosts.reboot-single |
[production] |
10:08 |
<kormat@cumin1001> |
dbctl commit (dc=all): 'Repooling after reboot. T261389', diff saved to https://phabricator.wikimedia.org/P12519 and previous config saved to /var/cache/conftool/dbconfig/20200908-100852-kormat.json |
[production] |
09:52 |
<akosiaris> |
enable puppet, run it on all k8s eqiad nodes and double check that calico-node is fine T239835 |
[production] |
09:43 |
<akosiaris> |
stopped calico-node and kube-apiserver on k8s nodes/masters T239835 |
[production] |
09:43 |
<marostegui> |
Stop mysql on es2014 to clone es2026 T261717 |
[production] |
09:39 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Depool es2014 - T261717', diff saved to https://phabricator.wikimedia.org/P12517 and previous config saved to /var/cache/conftool/dbconfig/20200908-093957-marostegui.json |
[production] |
09:37 |
<volans> |
running homer 'cr*eqiad*' commit "Update debmonitor IPs (#2), T261489" |
[production] |
09:33 |
<jayme@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) |
[production] |
09:33 |
<jayme@cumin1001> |
START - Cookbook sre.hosts.downtime |
[production] |
09:28 |
<kormat@cumin1001> |
dbctl commit (dc=all): 'Rebooting for T261389', diff saved to https://phabricator.wikimedia.org/P12515 and previous config saved to /var/cache/conftool/dbconfig/20200908-092755-kormat.json |
[production] |
09:27 |
<kormat@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) |
[production] |
09:27 |
<kormat@cumin1001> |
START - Cookbook sre.hosts.downtime |
[production] |
09:20 |
<jayme> |
disabling puppet on argon.eqiad.wmnet,chlorine.eqiad.wmnet,kubernetes[1001-1016].eqiad.wmnet - Reinitialize eqiad k8s cluster with new etcd - T239835 |
[production] |
08:55 |
<marostegui> |
Deploy schema change on s7 eqiad master - T253276 |
[production] |
08:48 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Reduce db2127's weight', diff saved to https://phabricator.wikimedia.org/P12514 and previous config saved to /var/cache/conftool/dbconfig/20200908-084834-marostegui.json |
[production] |
08:45 |
<volans> |
running homer 'cr*eqiad*' commit "Update debmonitor IPs, T261489" |
[production] |
08:23 |
<akosiaris@cumin1001> |
conftool action : set/pooled=false; selector: dnsdisc=blubberoid,name=eqiad |
[production] |
08:22 |
<oblivian@cumin1001> |
conftool action : set/pooled=false; selector: dnsdisc=restbase-async,name=eqiad |
[production] |
08:21 |
<oblivian@cumin1001> |
conftool action : set/pooled=true; selector: dnsdisc=restbase-async,name=codfw |
[production] |
08:20 |
<oblivian@cumin1001> |
conftool action : set/pooled=false; selector: dnsdisc=eventgate-main,name=eqiad |
[production] |
08:16 |
<moritzm> |
installing 4.19.132 kernel on buster systems (only installing the deb, reboots separately) |
[production] |
07:44 |
<urbanecm@deploy1001> |
Synchronized private/PrivateSettings.php: Revert "Update T250887 mitigations" (T250887; T262242) (duration: 00m 59s) |
[production] |
07:44 |
<elukey> |
roll restart kafka daemons on kafka-jumbo100[7-9] to pick up openjdk upgrades |
[production] |
07:40 |
<XioNoX> |
move HE from ix to transit BGP group on cr3-eqsin |
[production] |
07:00 |
<oblivian@deploy1001> |
helmfile [staging] Ran 'sync' command on namespace 'mobileapps' for release 'staging' . |
[production] |
06:58 |
<marostegui> |
Deploy schema change on s2 eqiad master - T253276 |
[production] |
06:58 |
<oblivian@deploy1001> |
helmfile [staging] Ran 'sync' command on namespace 'mobileapps' for release 'staging' . |
[production] |
06:56 |
<oblivian@deploy1001> |
helmfile [staging] Ran 'sync' command on namespace 'mobileapps' for release 'staging' . |
[production] |
06:50 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Depool db1106 for PDU maintenance', diff saved to https://phabricator.wikimedia.org/P12513 and previous config saved to /var/cache/conftool/dbconfig/20200908-065022-marostegui.json |
[production] |
06:47 |
<oblivian@deploy1001> |
helmfile [staging] Ran 'sync' command on namespace 'mobileapps' for release 'staging' . |
[production] |
06:31 |
<marostegui> |
Deploy schema change on s5 eqiad master - T253276 |
[production] |
06:23 |
<elukey> |
roll restart of Hadoop master daemons on an-master100[1,2] to pick up new openjdk settings |
[production] |
06:14 |
<marostegui> |
Stop MySQL on db1106 for PDU maintenance T261452 |
[production] |
05:34 |
<marostegui@cumin1001> |
END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) |
[production] |
05:32 |
<marostegui@cumin1001> |
START - Cookbook sre.hosts.downtime |
[production] |