2022-03-01
|
13:49 |
<vgutierrez@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp1087.eqiad.wmnet with reason: host reimage |
[production] |
13:48 |
<klausman@cumin2002> |
START - Cookbook sre.dns.netbox |
[production] |
13:48 |
<klausman@cumin2002> |
START - Cookbook sre.ganeti.makevm for new host ml-staging-etcd2002.codfw.wmnet |
[production] |
13:48 |
<klausman@cumin2002> |
END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host ml-staging-etcd2002.codfw.wmnet |
[production] |
13:48 |
<klausman@cumin2002> |
END (PASS) - Cookbook sre.dns.netbox (exit_code=0) |
[production] |
13:47 |
<vgutierrez@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on cp1087.eqiad.wmnet with reason: host reimage |
[production] |
13:44 |
<klausman@cumin2002> |
END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host ml-staging-etcd2003.codfw.wmnet |
[production] |
13:43 |
<klausman@cumin2002> |
END (PASS) - Cookbook sre.dns.netbox (exit_code=0) |
[production] |
13:43 |
<klausman@cumin2002> |
START - Cookbook sre.dns.netbox |
[production] |
13:43 |
<klausman@cumin2002> |
END (FAIL) - Cookbook sre.dns.netbox (exit_code=99) |
[production] |
13:40 |
<kormat> |
Deploying wmfmariadbpy 0.9 T302796 |
[production] |
13:40 |
<kormat> |
uploaded wmfmariadbpy 0.9 to apt.wm.o T302796 |
[production] |
13:39 |
<klausman@cumin2002> |
START - Cookbook sre.dns.netbox |
[production] |
13:39 |
<klausman@cumin2002> |
END (ERROR) - Cookbook sre.dns.netbox (exit_code=97) |
[production] |
13:39 |
<klausman@cumin2002> |
START - Cookbook sre.dns.netbox |
[production] |
13:39 |
<klausman@cumin2002> |
START - Cookbook sre.ganeti.makevm for new host ml-staging-etcd2003.codfw.wmnet |
[production] |
13:39 |
<klausman@cumin2002> |
START - Cookbook sre.dns.netbox |
[production] |
13:39 |
<klausman@cumin2002> |
START - Cookbook sre.ganeti.makevm for new host ml-staging-etcd2002.codfw.wmnet |
[production] |
13:32 |
<moritzm> |
restarting nginx on registry* nodes to pick up expat update |
[production] |
13:31 |
<vgutierrez@cumin1001> |
START - Cookbook sre.hosts.reimage for host cp1087.eqiad.wmnet with OS buster |
[production] |
13:15 |
<XioNoX> |
restart cr1-drmrs for software upgrade |
[production] |
13:03 |
<moritzm> |
restarting FPM/Apache on parsoid hosts to pick up expat update |
[production] |
12:49 |
<vgutierrez> |
pool cp3062 running HAProxy as TLS termination layer - T290005 T271421 |
[production] |
12:47 |
<vgutierrez@cumin1001> |
END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp3062.esams.wmnet with OS buster |
[production] |
12:39 |
<moritzm> |
installing expat security updates |
[production] |
12:34 |
<mmandere> |
restart purged on cp60[12-14] |
[production] |
12:32 |
<jgiannelos@deploy1002> |
Finished deploy [kartotherian/deploy@41d2498] (eqiad): Reduce pool size to 1 connection per node worker (duration: 01m 06s) |
[production] |
12:31 |
<jgiannelos@deploy1002> |
Started deploy [kartotherian/deploy@41d2498] (eqiad): Reduce pool size to 1 connection per node worker |
[production] |
12:30 |
<jgiannelos@deploy1002> |
Finished deploy [kartotherian/deploy@41d2498] (codfw): Reduce pool size to 1 connection per node worker (duration: 01m 30s) |
[production] |
12:28 |
<jgiannelos@deploy1002> |
Started deploy [kartotherian/deploy@41d2498] (codfw): Reduce pool size to 1 connection per node worker |
[production] |
12:15 |
<jgiannelos@deploy1002> |
Finished deploy [kartotherian/deploy@51d5a07] (codfw): Fix pool size configuration (duration: 01m 41s) |
[production] |
12:13 |
<jgiannelos@deploy1002> |
Started deploy [kartotherian/deploy@51d5a07] (codfw): Fix pool size configuration |
[production] |
12:11 |
<jgiannelos@deploy1002> |
Finished deploy [kartotherian/deploy@51d5a07] (eqiad): Fix pool size configuration (duration: 02m 01s) |
[production] |
12:09 |
<jgiannelos@deploy1002> |
Started deploy [kartotherian/deploy@51d5a07] (eqiad): Fix pool size configuration |
[production] |
11:43 |
<klausman@cumin2002> |
END (PASS) - Cookbook sre.dns.netbox (exit_code=0) |
[production] |
11:36 |
<kharlan@deploy1002> |
helmfile [codfw] DONE helmfile.d/services/linkrecommendation: apply |
[production] |
11:35 |
<klausman@cumin2002> |
START - Cookbook sre.dns.netbox |
[production] |
11:35 |
<klausman@cumin2002> |
START - Cookbook sre.ganeti.makevm for new host ml-staging-etcd2001.codfw.wmnet |
[production] |
11:33 |
<kharlan@deploy1002> |
helmfile [codfw] START helmfile.d/services/linkrecommendation: apply |
[production] |
11:32 |
<kharlan@deploy1002> |
helmfile [eqiad] DONE helmfile.d/services/linkrecommendation: apply |
[production] |
11:30 |
<kharlan@deploy1002> |
helmfile [eqiad] START helmfile.d/services/linkrecommendation: apply |
[production] |
11:28 |
<kharlan@deploy1002> |
helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply |
[production] |
11:27 |
<kharlan@deploy1002> |
helmfile [staging] START helmfile.d/services/linkrecommendation: apply |
[production] |
11:27 |
<cmooney@cumin1001> |
END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host an-worker1148.mgmt.eqiad.wmnet with reboot policy FORCED |
[production] |
11:21 |
<_joe_> |
restarted pybal, removed ipvsadm entry on lvs1019. Now all of MediaWiki has no http LVS endpoint available. T244843 |
[production] |
11:18 |
<_joe_> |
also removed the ipvsadm entry for apaches:80 T244843 |
[production] |
11:17 |
<jayme> |
rolled back linkrecommendation staging helm release to revision 12 - T302744 |
[production] |
11:17 |
<_joe_> |
restarting pybal on lvs1020 T244843 |
[production] |
11:11 |
<vgutierrez@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp3062.esams.wmnet with reason: host reimage |
[production] |
11:11 |
<_joe_> |
restarted pybal on lvs2009, T244843 |
[production] |