2022-08-11
20:26 |
<mwdebug-deploy@deploy1002> |
helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply |
[production] |
20:25 |
<mwdebug-deploy@deploy1002> |
helmfile [eqiad] START helmfile.d/services/mwdebug: apply |
[production] |
20:23 |
<mutante> |
merging change on prod phabricator host to allow scap deployment, part 1 |
[production] |
19:42 |
<damilare> |
payments-wiki upgraded from cf5e1848 to 0894d75a |
[production] |
19:41 |
<mutante> |
disabling puppet on C:profile::phabricator::main |
[production] |
19:20 |
<mvernon@cumin2002> |
END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:restbase-eqiad: upgrade to 3.11.13 T309896 - mvernon@cumin2002 |
[production] |
17:58 |
<mwdebug-deploy@deploy1002> |
helmfile [codfw] DONE helmfile.d/services/mwdebug: apply |
[production] |
17:58 |
<taavi@deploy1002> |
Synchronized wmf-config/CommonSettings.php: Config: [[gerrit:822428|Fix labtestwiki database name servers (T310795)]] (duration: 03m 39s) |
[production] |
17:57 |
<mwdebug-deploy@deploy1002> |
helmfile [codfw] START helmfile.d/services/mwdebug: apply |
[production] |
17:57 |
<mwdebug-deploy@deploy1002> |
helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply |
[production] |
17:56 |
<mwdebug-deploy@deploy1002> |
helmfile [eqiad] START helmfile.d/services/mwdebug: apply |
[production] |
17:52 |
<sukhe> |
testing ATS 9.1.3-1wm1 on cp3064: T309651 |
[production] |
17:49 |
<pt1979@cumin2002> |
END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host netmon2002.mgmt.codfw.wmnet with reboot policy FORCED |
[production] |
17:46 |
<sukhe> |
testing ATS 9.1.3-1wm1 on cp3064: T309651 |
[production] |
17:41 |
<pt1979@cumin2002> |
START - Cookbook sre.hosts.provision for host netmon2002.mgmt.codfw.wmnet with reboot policy FORCED |
[production] |
17:40 |
<pt1979@cumin2002> |
END (PASS) - Cookbook sre.dns.netbox (exit_code=0) |
[production] |
17:38 |
<sukhe> |
testing ATS 9.1.3-1wm1 on cp1090: T309651 |
[production] |
17:36 |
<pt1979@cumin2002> |
START - Cookbook sre.dns.netbox |
[production] |
17:35 |
<pt1979@cumin2002> |
END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host netmon2002 |
[production] |
17:34 |
<pt1979@cumin2002> |
START - Cookbook sre.network.configure-switch-interfaces for host netmon2002 |
[production] |
17:33 |
<sukhe> |
testing ATS 9.1.3-1wm1 on cp3065: T309651 |
[production] |
17:28 |
<sukhe> |
testing ATS 9.1.3-1wm1 on cp1089: T309651 |
[production] |
17:18 |
<bking@cumin1001> |
conftool action : set/weight=10:pooled=no; selector: service=elasticsearch-omega-ssl,name=elastic1100.eqiad.wmnet |
[production] |
17:18 |
<bking@cumin1001> |
conftool action : set/weight=10:pooled=yes; selector: service=elasticsearch-omega-ssl,name=elastic1100.eqiad.wmnet |
[production] |
17:15 |
<bking@cumin1001> |
conftool action : set/weight=10:pooled=yes; selector: service=search-omega-https,name=elastic1100.eqiad.wmnet |
[production] |
16:35 |
<mvernon@cumin2002> |
START - Cookbook sre.cassandra.roll-restart for nodes matching A:restbase-eqiad: upgrade to 3.11.13 T309896 - mvernon@cumin2002 |
[production] |
16:30 |
<mvernon@cumin2002> |
END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:restbase-codfw: upgrade to 3.11.13 T309896 - mvernon@cumin2002 |
[production] |
16:29 |
<bking@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on elastic[1100-1102].eqiad.wmnet with reason: T309810 |
[production] |
16:29 |
<bking@cumin1001> |
START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on elastic[1100-1102].eqiad.wmnet with reason: T309810 |
[production] |
16:26 |
<inflatador> |
bking@elastic1054 attempting to ban elastic1100-1102 from cluster due to firewall issues |
[production] |
16:13 |
<bking@cumin1001> |
conftool action : set/weight=10:pooled=yes; selector: service=search-omega-https,name=elastic1100.eqiad.wmnet |
[production] |
16:12 |
<bking@cumin1001> |
conftool action : set/weight=10:pooled=yes; selector: name=elastic1100 |
[production] |
15:15 |
<cmjohnson@cumin1001> |
END (PASS) - Cookbook sre.dns.netbox (exit_code=0) |
[production] |
15:09 |
<cmjohnson@cumin1001> |
START - Cookbook sre.dns.netbox |
[production] |
14:58 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'db1162 (re)pooling @ 100%: Maint done', diff saved to https://phabricator.wikimedia.org/P32364 and previous config saved to /var/cache/conftool/dbconfig/20220811-145823-ladsgroup.json |
[production] |
14:55 |
<inflatador> |
bking@cumin1001 running puppet agent across eqiad elastic hosts |
[production] |
14:48 |
<ryankemper@cumin1001> |
END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.REIMAGE (1 nodes at a time) for ElasticSearch cluster search_eqiad: eqiad cluster reimage (bullseye upgrade) - ryankemper@cumin1001 - T289135 |
[production] |
14:43 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'db1162 (re)pooling @ 75%: Maint done', diff saved to https://phabricator.wikimedia.org/P32362 and previous config saved to /var/cache/conftool/dbconfig/20220811-144318-ladsgroup.json |
[production] |
14:28 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'db1162 (re)pooling @ 25%: Maint done', diff saved to https://phabricator.wikimedia.org/P32361 and previous config saved to /var/cache/conftool/dbconfig/20220811-142813-ladsgroup.json |
[production] |
14:28 |
<andrew@cumin1001> |
END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts cloudcontrol1003.wikimedia.org |
[production] |
14:28 |
<andrew@cumin1001> |
END (PASS) - Cookbook sre.dns.netbox (exit_code=0) |
[production] |
14:24 |
<andrew@cumin1001> |
START - Cookbook sre.dns.netbox |
[production] |
14:19 |
<andrew@cumin1001> |
START - Cookbook sre.hosts.decommission for hosts cloudcontrol1003.wikimedia.org |
[production] |
14:19 |
<mwdebug-deploy@deploy1002> |
helmfile [codfw] DONE helmfile.d/services/mwdebug: apply |
[production] |
14:18 |
<andrew@cumin1001> |
END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts cloudcontrol1004.wikimedia.org |
[production] |
14:18 |
<andrew@cumin1001> |
END (PASS) - Cookbook sre.dns.netbox (exit_code=0) |
[production] |
14:18 |
<mwdebug-deploy@deploy1002> |
helmfile [codfw] START helmfile.d/services/mwdebug: apply |
[production] |
14:18 |
<mwdebug-deploy@deploy1002> |
helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply |
[production] |
14:17 |
<mwdebug-deploy@deploy1002> |
helmfile [eqiad] START helmfile.d/services/mwdebug: apply |
[production] |
14:17 |
<ladsgroup@deploy1002> |
Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:822375|Stop writing to the old templatelinks fields in s2 (T312865)]] (duration: 03m 25s) |
[production] |