2021-06-12
23:59 |
<Krinkle> |
wm-bot has been migrated to Libera (logging #countervandalism, and serving #cvn-es-scan, #cvn-wp-fa, #cvn-zh-sw and #cvn-zho) |
[cvn] |
23:56 |
<MacFan4000> |
copied channel config for #cvn-es-scan, #cvn-wp-fa and #countervandalism |
[wm-bot] |
19:27 |
<valhallasw> |
Deployed 081c4e1df0cf8c7e3525f12f194197c2fe1b53a7 |
[tools.forrestbot] |
14:42 |
<majavah> |
sync hiera key prometheus_nodes to match tools |
[toolsbeta] |
14:39 |
<majavah> |
remove nonexistent tools-prometheus-04 and add tools-prometheus-05 to hiera key "prometheus_nodes" |
[tools] |
13:53 |
<majavah> |
create empty bullseye-{tools,toolsbeta} repositories on tools-services-05 aptly |
[tools] |
13:49 |
<rzl@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on 6 hosts with reason: alert noise, no impact, x2 is unused |
[production] |
13:49 |
<rzl@cumin1001> |
START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on 6 hosts with reason: alert noise, no impact, x2 is unused |
[production] |
2021-06-11
23:37 |
<mutante> |
removing firewall hole for mgmt networks to install* because it turned out it can't be used for firmware upgrades |
[production] |
22:08 |
<brennen> |
gitlab.wikimedia.org currently up with recommended config applied; test data deleted; users can register but not create projects. brennen, dancy, and thcipriani currently marked as admins. may need to reset data again, but hopefully not. |
[production] |
21:27 |
<pt1979@cumin2002> |
END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on pc2014.codfw.wmnet with reason: REIMAGE |
[production] |
21:25 |
<pt1979@cumin2002> |
START - Cookbook sre.hosts.downtime for 2:00:00 on pc2014.codfw.wmnet with reason: REIMAGE |
[production] |
21:01 |
<pt1979@cumin2002> |
END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on pc2013.codfw.wmnet with reason: REIMAGE |
[production] |
20:59 |
<pt1979@cumin2002> |
START - Cookbook sre.hosts.downtime for 2:00:00 on pc2013.codfw.wmnet with reason: REIMAGE |
[production] |
20:49 |
<brennen> |
gitlab1001: resetting application data, re-running ansible playbook |
[releng] |
20:04 |
<pt1979@cumin2002> |
END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on pc2012.codfw.wmnet with reason: REIMAGE |
[production] |
20:02 |
<pt1979@cumin2002> |
START - Cookbook sre.hosts.downtime for 2:00:00 on pc2012.codfw.wmnet with reason: REIMAGE |
[production] |
19:27 |
<pt1979@cumin2002> |
END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on pc2011.codfw.wmnet with reason: REIMAGE |
[production] |
19:25 |
<pt1979@cumin2002> |
START - Cookbook sre.hosts.downtime for 2:00:00 on pc2011.codfw.wmnet with reason: REIMAGE |
[production] |
16:40 |
<hnowlan@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on maps1008.eqiad.wmnet with reason: Reparenting from maps1004 |
[production] |
16:40 |
<hnowlan@cumin1001> |
START - Cookbook sre.hosts.downtime for 3 days, 0:00:00 on maps1008.eqiad.wmnet with reason: Reparenting from maps1004 |
[production] |
15:50 |
<James_F> |
Zuul: [node-rdkafka-statsd] Switch to service-pipeline-test T284345 |
[releng] |
15:32 |
<wm-bot> |
<jeanfred> Deploy a753cb2 |
[tools.integraality] |
15:29 |
<wm-bot> |
<jeanfred> Deploy 5187a4d |
[tools.integraality] |
15:25 |
<James_F> |
Zuul: [node-rdkafka-factory] Switch to service-pipeline-test T284345 |
[releng] |
15:25 |
<majavah> |
undeploy nginx-ingress-jobs from kubernetes |
[toolsbeta] |
15:01 |
<reedy@deploy1002> |
Synchronized php-1.37.0-wmf.9/extensions/MediaSearch/extension.json: Make MediaSearch default search experience for all users (duration: 00m 57s) |
[production] |
15:00 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1143 (re)pooling @ 100%: Repool db1143 after upgrade', diff saved to https://phabricator.wikimedia.org/P16432 and previous config saved to /var/cache/conftool/dbconfig/20210611-150018-root.json |
[production] |
14:54 |
<majavah> |
generate and add own root key to passwords::root::extra_keys |
[toolsbeta] |
14:47 |
<majavah> |
generate and add my (taavi) own root key to deployment-prep |
[releng] |
14:45 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1143 (re)pooling @ 75%: Repool db1143 after upgrade', diff saved to https://phabricator.wikimedia.org/P16431 and previous config saved to /var/cache/conftool/dbconfig/20210611-144514-root.json |
[production] |
14:45 |
<balloons> |
Increase quota to 21 cpu, 42GB ram, 360G disk T284662 |
[logging] |
14:44 |
<mbsantos@deploy1002> |
Finished deploy [tilerator/deploy@6bfdab5]: (no justification provided) (duration: 00m 05s) |
[production] |
14:44 |
<mbsantos@deploy1002> |
Started deploy [tilerator/deploy@6bfdab5]: (no justification provided) |
[production] |
14:43 |
<mbsantos@deploy1002> |
Finished deploy [kartotherian/deploy@5d7c993]: (no justification provided) (duration: 00m 05s) |
[production] |
14:42 |
<mbsantos@deploy1002> |
Started deploy [kartotherian/deploy@5d7c993]: (no justification provided) |
[production] |
14:39 |
<balloons> |
add 16 vcpu, for 178 total T284507 |
[integration] |
14:38 |
<balloons> |
set quota to 16 vcpu, 24G ram T284527 |
[wikisource] |
14:36 |
<hnowlan@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on maps1008.eqiad.wmnet with reason: Reparenting from maps1009 |
[production] |
14:36 |
<hnowlan@cumin1001> |
START - Cookbook sre.hosts.downtime for 4:00:00 on maps1008.eqiad.wmnet with reason: Reparenting from maps1009 |
[production] |
14:35 |
<jiji@deploy1002> |
helmfile [codfw] DONE helmfile.d/admin 'apply'. |
[production] |
14:35 |
<jiji@deploy1002> |
helmfile [codfw] START helmfile.d/admin 'apply'. |
[production] |
14:34 |
<jiji@deploy1002> |
helmfile [eqiad] DONE helmfile.d/admin 'apply'. |
[production] |
14:34 |
<jiji@deploy1002> |
helmfile [eqiad] START helmfile.d/admin 'apply'. |
[production] |
14:34 |
<hnowlan@puppetmaster1001> |
conftool action : set/pooled=no; selector: name=maps1008.eqiad.wmnet |
[production] |