2021-04-27
§
|
19:37 |
<herron@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main2004.codfw.wmnet with reason: REIMAGE |
[production] |
19:35 |
<papaul> |
powerdown ms-backup2001 for maintenance |
[production] |
19:35 |
<herron@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main2004.codfw.wmnet with reason: REIMAGE |
[production] |
19:07 |
<papaul> |
powerdown logstash2035 for maintenance |
[production] |
19:03 |
<dzahn@cumin1001> |
START - Cookbook sre.ganeti.makevm for new host people1003.eqiad.wmnet |
[production] |
19:00 |
<dzahn@cumin1001> |
END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts people1003.eqiad.wmnet |
[production] |
18:50 |
<mutante> |
people1003 - destroying VM and recreating again from scratch to test if issue of no console and no access is repeatable |
[production] |
18:50 |
<dzahn@cumin1001> |
START - Cookbook sre.hosts.decommission for hosts people1003.eqiad.wmnet |
[production] |
18:37 |
<herron@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main1005.eqiad.wmnet with reason: REIMAGE |
[production] |
18:35 |
<herron@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main1005.eqiad.wmnet with reason: REIMAGE |
[production] |
18:33 |
<mutante> |
people1003 - rebooting, trying to get new VM to work |
[production] |
18:33 |
<Urbanecm> |
Morning B&C window done |
[production] |
18:32 |
<urbanecm@deploy1002> |
Synchronized wmf-config/InitialiseSettings.php: 91a85f2: ac770bf: Enable language in header for office and testwiki users (T280526) (duration: 01m 19s) |
[production] |
18:32 |
<bblack> |
lvs2009 - restart pybal + re-run puppet agent - T279457 |
[production] |
18:23 |
<robh@cumin1001> |
END (PASS) - Cookbook sre.dns.netbox (exit_code=0) |
[production] |
18:20 |
<bblack@cumin1001> |
conftool action : set/pooled=yes; selector: name=cp203[56].codfw.wmnet |
[production] |
18:20 |
<bblack> |
cp203[56] - repooling in etcd - T279457 |
[production] |
18:19 |
<robh@cumin1001> |
START - Cookbook sre.dns.netbox |
[production] |
18:17 |
<robh@cumin1001> |
END (ERROR) - Cookbook sre.dns.netbox (exit_code=97) |
[production] |
18:17 |
<robh@cumin1001> |
START - Cookbook sre.dns.netbox |
[production] |
18:16 |
<robh@cumin1001> |
END (FAIL) - Cookbook sre.dns.netbox (exit_code=99) |
[production] |
18:12 |
<robh@cumin1001> |
START - Cookbook sre.dns.netbox |
[production] |
18:11 |
<bblack> |
dns2001 - restarting bird to repool, then re-enabling puppet - T279457 |
[production] |
18:04 |
<pt1979@cumin2001> |
END (FAIL) - Cookbook sre.dns.netbox (exit_code=99) |
[production] |
18:02 |
<pt1979@cumin2001> |
START - Cookbook sre.dns.netbox |
[production] |
18:02 |
<ejegg> |
update payments-wiki from 9a4eef1375 to 44570561f2 |
[production] |
18:00 |
<herron@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main1004.eqiad.wmnet with reason: REIMAGE |
[production] |
17:58 |
<herron@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main1004.eqiad.wmnet with reason: REIMAGE |
[production] |
17:34 |
<papaul> |
powerdown moss-fe2001 for maintenance |
[production] |
17:32 |
<robh@cumin1001> |
END (FAIL) - Cookbook sre.dns.netbox (exit_code=99) |
[production] |
17:29 |
<robh@cumin1001> |
START - Cookbook sre.dns.netbox |
[production] |
17:25 |
<mbsantos@deploy1002> |
helmfile [eqiad] Ran 'sync' command on namespace 'mobileapps' for release 'production' . |
[production] |
17:23 |
<mbsantos@deploy1002> |
helmfile [codfw] Ran 'sync' command on namespace 'mobileapps' for release 'production' . |
[production] |
17:21 |
<mbsantos@deploy1002> |
helmfile [staging] Ran 'sync' command on namespace 'mobileapps' for release 'staging' . |
[production] |
17:19 |
<ryankemper> |
T281215 Banned `elastic2043` from codfw cirrussearch cluster |
[production] |
17:16 |
<mbsantos@deploy1002> |
helmfile [eqiad] Ran 'sync' command on namespace 'proton' for release 'production' . |
[production] |
17:14 |
<papaul> |
powerdown kafka-logging2003 for maintenance |
[production] |
17:14 |
<mbsantos@deploy1002> |
helmfile [codfw] Ran 'sync' command on namespace 'proton' for release 'production' . |
[production] |
17:10 |
<mbsantos@deploy1002> |
helmfile [staging] Ran 'sync' command on namespace 'proton' for release 'production' . |
[production] |
17:09 |
<mbsantos@deploy1002> |
helmfile [eqiad] Ran 'sync' command on namespace 'wikifeeds' for release 'production' . |
[production] |
17:07 |
<mbsantos@deploy1002> |
helmfile [codfw] Ran 'sync' command on namespace 'wikifeeds' for release 'production' . |
[production] |
17:04 |
<mbsantos@deploy1002> |
helmfile [staging] Ran 'sync' command on namespace 'wikifeeds' for release 'staging' . |
[production] |
16:52 |
<papaul> |
powerdown elastic2045 for maintenance |
[production] |
16:49 |
<papaul> |
powerdown ms-be2042 for maintenance |
[production] |
16:39 |
<dcaro> |
reprepro updating packages on thirdparty/ceph-nautilus-buster |
[production] |
16:34 |
<pt1979@cumin2001> |
END (FAIL) - Cookbook sre.dns.netbox (exit_code=99) |
[production] |
16:29 |
<pt1979@cumin2001> |
START - Cookbook sre.dns.netbox |
[production] |
16:23 |
<andrew@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 39 hosts with reason: upgrading openstack |
[production] |
16:23 |
<andrew@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on 39 hosts with reason: upgrading openstack |
[production] |
16:22 |
<effie> |
upgrading scap 3.17.1-1 on mediawiki canaries - T279695 |
[production] |