2021-04-26
|
10:22 |
<jmm@cumin2001> |
END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host ldap-replica1003.wikimedia.org |
[production] |
10:18 |
<moritzm> |
installing systemd updates from buster 10.9 point release |
[production] |
10:07 |
<jmm@cumin2001> |
START - Cookbook sre.ganeti.makevm for new host ldap-replica1003.wikimedia.org |
[production] |
10:00 |
<filippo@cumin1001> |
conftool action : set/pooled=true; selector: dnsdisc=swift,name=eqiad |
[production] |
09:53 |
<jmm@cumin2001> |
END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host ldap-replica2006.wikimedia.org |
[production] |
09:42 |
<moritzm> |
installing clamav security updates on otrs1001 |
[production] |
09:38 |
<godog> |
reboot ms-be1062, kernel backtrace saved |
[production] |
09:26 |
<filippo@cumin1001> |
conftool action : set/pooled=false; selector: dnsdisc=swift,name=eqiad |
[production] |
09:26 |
<jmm@cumin2001> |
START - Cookbook sre.ganeti.makevm for new host ldap-replica2006.wikimedia.org |
[production] |
09:24 |
<jmm@cumin2001> |
END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host ldap-replica2005.wikimedia.org |
[production] |
09:15 |
<jayme@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on conf2005.codfw.wmnet with reason: for initial etcd replication |
[production] |
09:15 |
<jayme@cumin1001> |
START - Cookbook sre.hosts.downtime for 1:00:00 on conf2005.codfw.wmnet with reason: for initial etcd replication |
[production] |
09:13 |
<jayme> |
imported etcd-mirror_0.0.6-2 to buster-wikimedia |
[production] |
09:10 |
<jmm@cumin2001> |
START - Cookbook sre.ganeti.makevm for new host ldap-replica2005.wikimedia.org |
[production] |
09:07 |
<jmm@cumin2001> |
END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host ldap-replica2005failoid1002.wikimedia.org |
[production] |
09:04 |
<jayme> |
imported etcd-mirror_0.0.6-1 to buster-wikimedia |
[production] |
08:55 |
<jmm@cumin2001> |
START - Cookbook sre.ganeti.makevm for new host ldap-replica2005failoid1002.wikimedia.org |
[production] |
08:49 |
<urbanecm@deploy1002> |
Synchronized wmf-config/InitialiseSettings.php: NOOP: f01a6dab70f74938dd51668809a181a8f551b6c8: GrowthExperiments: Enable community configuration on testwiki (T274520) (duration: 00m 57s) |
[production] |
08:42 |
<urbanecm@deploy1002> |
Synchronized wmf-config/InitialiseSettings.php: NOOP: 88da8226823e59d1d19db9aeca3b5a5140c0c60c: GrowthExperiments: Do not enable community configuration outside of beta wikis (T274520) (duration: 00m 59s) |
[production] |
08:28 |
<moritzm> |
update debmonitor to 0.2.9 on remaining hosts T281090 |
[production] |
08:13 |
<moritzm> |
installing lxml security updates on stretch |
[production] |
07:54 |
<jayme@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on conf2005.codfw.wmnet with reason: for initial etcd replication |
[production] |
07:54 |
<jayme@cumin1001> |
START - Cookbook sre.hosts.downtime for 1:00:00 on conf2005.codfw.wmnet with reason: for initial etcd replication |
[production] |
07:53 |
<filippo@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on thanos-fe1001.eqiad.wmnet with reason: REIMAGE |
[production] |
07:51 |
<filippo@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on thanos-fe1001.eqiad.wmnet with reason: REIMAGE |
[production] |
07:32 |
<godog> |
swift eqiad-prod: less weight for ms-be[1019-1026] / more weight to ms-be106[0-3] - T272836 |
[production] |
07:24 |
<moritzm> |
installing pear security updates |
[production] |
07:09 |
<moritzm> |
removed rawdog from bullseye-wikimedia, needs Py2 T280989 |
[production] |
06:24 |
<elukey> |
reboot an-coord1001 to pick up kernel security settings (after reimage) |
[production] |
05:47 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Add db1158 to dbctl, depooled, T258361', diff saved to https://phabricator.wikimedia.org/P15521 and previous config saved to /var/cache/conftool/dbconfig/20210426-054700-marostegui.json |
[production] |
05:32 |
<marostegui@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1124.eqiad.wmnet with reason: REIMAGE |
[production] |
05:30 |
<marostegui@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on db1124.eqiad.wmnet with reason: REIMAGE |
[production] |
03:43 |
<kart_> |
Updated cxserver to 2021-04-21-044024-production (T279045) |
[production] |
03:41 |
<kartik@deploy1002> |
helmfile [eqiad] Ran 'sync' command on namespace 'cxserver' for release 'production' . |
[production] |
03:37 |
<kartik@deploy1002> |
helmfile [codfw] Ran 'sync' command on namespace 'cxserver' for release 'production' . |
[production] |
03:32 |
<kartik@deploy1002> |
helmfile [staging] Ran 'sync' command on namespace 'cxserver' for release 'staging' . |
[production] |
2021-04-23
|
21:36 |
<foks> |
removing 1 file for legal compliance |
[production] |
20:15 |
<mutante> |
[apt1001:~] $ sudo -i reprepro -C main includedeb bullseye-wikimedia /home/dzahn/rawdog_2.23-2_all.deb (T280989) |
[production] |
19:41 |
<mutante> |
[apt1001:~] $ sudo -i reprepro copy bullseye-wikimedia buster-wikimedia envoyproxy - copy envoy package from buster to bullseye T280989 |
[production] |
19:09 |
<ebernhardson> |
closing duplicate/wrong cluster indices in cloudelastic |
[production] |
17:02 |
<elukey@puppetmaster1001> |
conftool action : set/pooled=yes; selector: name=cp1087.eqiad.wmnet |
[production] |
16:35 |
<cmjohnson@cumin1001> |
END (PASS) - Cookbook sre.dns.netbox (exit_code=0) |
[production] |
16:32 |
<cmjohnson@cumin1001> |
START - Cookbook sre.dns.netbox |
[production] |
16:24 |
<cmjohnson@cumin1001> |
END (PASS) - Cookbook sre.dns.netbox (exit_code=0) |
[production] |
16:19 |
<cmjohnson@cumin1001> |
START - Cookbook sre.dns.netbox |
[production] |
14:59 |
<jbond@cumin1001> |
END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on theemin.codfw.wmnet with reason: REIMAGE |
[production] |
14:59 |
<jbond@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on theemin.codfw.wmnet with reason: REIMAGE |
[production] |
14:25 |
<moritzm> |
revert back bullseye image to daily build from last week (to rule out potential reimage issue) |
[production] |