2020-12-17
§
|
13:11 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1082 (re)pooling @ 75%: Repooling after cloning db1154:3315 as sanitarium T268742', diff saved to https://phabricator.wikimedia.org/P13574 and previous config saved to /var/cache/conftool/dbconfig/20201217-131059-root.json |
[production] |
13:01 |
<marostegui> |
Stop mysql on db1087 to clone db1154 |
[production] |
13:01 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Depool db1087 to clone db1154:3318 add db1092 as vslow,dump service for s8 T268742 ', diff saved to https://phabricator.wikimedia.org/P13571 and previous config saved to /var/cache/conftool/dbconfig/20201217-130101-marostegui.json |
[production] |
12:56 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1089 (re)pooling @ 25%: Repool db1089 after helping out on db1106', diff saved to https://phabricator.wikimedia.org/P13570 and previous config saved to /var/cache/conftool/dbconfig/20201217-125624-root.json |
[production] |
12:55 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1082 (re)pooling @ 50%: Repooling after cloning db1154:3315 as sanitarium T268742', diff saved to https://phabricator.wikimedia.org/P13569 and previous config saved to /var/cache/conftool/dbconfig/20201217-125556-root.json |
[production] |
12:55 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Change db1089 weights', diff saved to https://phabricator.wikimedia.org/P13568 and previous config saved to /var/cache/conftool/dbconfig/20201217-125535-marostegui.json |
[production] |
12:54 |
<arturo> |
joining new etcd nodes in the k8s etcd cluster (T267966) |
[tools] |
12:54 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Repool db1106 after cloning db1154:3311 as sanitarium T268742', diff saved to https://phabricator.wikimedia.org/P13567 and previous config saved to /var/cache/conftool/dbconfig/20201217-125446-marostegui.json |
[production] |
12:52 |
<arturo> |
adding more etcd nodes in the hiera key in tools-k8s-etcd puppet prefix https://gerrit.wikimedia.org/r/plugins/gitiles/cloud/instance-puppet/+/b4f60768078eccdabdfab4cd99c7c57076de51b2 |
[tools] |
12:50 |
<arturo> |
dropping more unused hiera keys in the tools-k8s-etcd puppet prefix https://gerrit.wikimedia.org/r/plugins/gitiles/cloud/instance-puppet/+/e9e66a6787d9b91c08cf4742a27b90b3e6d05aac |
[tools] |
12:49 |
<arturo> |
dropping unused hiera keys in the tools-k8s-etcd puppet prefix https://gerrit.wikimedia.org/r/plugins/gitiles/cloud/instance-puppet/+/2b4cb4a41756e602fb0996e7d0210e9102172424 |
[tools] |
12:40 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1082 (re)pooling @ 25%: Repooling after cloning db1154:3315 as sanitarium T268742', diff saved to https://phabricator.wikimedia.org/P13566 and previous config saved to /var/cache/conftool/dbconfig/20201217-124052-root.json |
[production] |
12:36 |
<jbond42> |
disable puppet fleet wide for config master vhost change |
[production] |
12:23 |
<matthiasmullie> |
EU backport+config window done |
[production] |
12:23 |
<elukey@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-test-coord1001.eqiad.wmnet with reason: REIMAGE |
[production] |
12:22 |
<mlitn@deploy1001> |
Synchronized wmf-config/InitialiseSettings.php: f3a50cb06: Enable ContentTranslation as default tool for ceb, km, mg, tg and yi WPs (duration: 01m 02s) |
[production] |
12:21 |
<elukey@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on an-test-coord1001.eqiad.wmnet with reason: REIMAGE |
[production] |
12:17 |
<mlitn@deploy1001> |
Synchronized wmf-config/InitialiseSettings.php: a29fec312: Add Wikidocumentaries campaign for ContentTranslation (duration: 01m 02s) |
[production] |
12:16 |
<arturo> |
created VM `tools-k8s-etcd-8` (T267966) |
[tools] |
12:15 |
<arturo> |
created VM `tools-k8s-etcd-7` (T267966) |
[tools] |
12:13 |
<arturo> |
created `tools-k8s-etcd` anti-affinity server group |
[tools] |
12:07 |
<mlitn@deploy1001> |
Synchronized wmf-config/SearchSettingsForSDC.php: 68ac6fa61: Media Search: Remove license map from config (duration: 01m 04s) |
[production] |
11:38 |
<kart_> |
Updated cxserver to 2020-12-17-111820-production (T262192) |
[production] |
11:36 |
<kartik@deploy1001> |
helmfile [eqiad] Ran 'sync' command on namespace 'cxserver' for release 'production' . |
[production] |
11:34 |
<kartik@deploy1001> |
helmfile [codfw] Ran 'sync' command on namespace 'cxserver' for release 'production' . |
[production] |
11:32 |
<kartik@deploy1001> |
helmfile [staging] Ran 'sync' command on namespace 'cxserver' for release 'staging' . |
[production] |
11:27 |
<godog> |
bounce apache2 on grafana1002 |
[production] |
11:26 |
<elukey@cumin1001> |
END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on an-test-worker1003.eqiad.wmnet with reason: REIMAGE |
[production] |
11:24 |
<elukey@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-test-worker1001.eqiad.wmnet with reason: REIMAGE |
[production] |
11:22 |
<elukey@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-test-worker1002.eqiad.wmnet with reason: REIMAGE |
[production] |
11:21 |
<elukey@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on an-test-worker1001.eqiad.wmnet with reason: REIMAGE |
[production] |
11:21 |
<elukey@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on an-test-worker1003.eqiad.wmnet with reason: REIMAGE |
[production] |
11:20 |
<elukey@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on an-test-worker1002.eqiad.wmnet with reason: REIMAGE |
[production] |
11:20 |
<elukey@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-test-master1001.eqiad.wmnet with reason: REIMAGE |
[production] |
11:18 |
<elukey@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-test-master1002.eqiad.wmnet with reason: REIMAGE |
[production] |
11:16 |
<elukey@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on an-test-master1001.eqiad.wmnet with reason: REIMAGE |
[production] |
11:16 |
<elukey@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on an-test-master1002.eqiad.wmnet with reason: REIMAGE |
[production] |
11:10 |
<jbond@cumin1001> |
END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) |
[production] |
11:08 |
<jbond@cumin1001> |
START - Cookbook sre.hosts.reboot-single |
[production] |
11:04 |
<elukey> |
wipe/reimage the hadoop test cluster to start clean for CDH (and then test the upgrade to bigtop 1.5) |
[analytics] |
10:50 |
<elukey@cumin1001> |
END (PASS) - Cookbook sre.hadoop.stop-cluster (exit_code=0) for Hadoop test cluster: Stop the Hadoop cluster before maintenance. - elukey@cumin1001 |
[production] |
10:45 |
<elukey@cumin1001> |
START - Cookbook sre.hadoop.stop-cluster for Hadoop test cluster: Stop the Hadoop cluster before maintenance. - elukey@cumin1001 |
[production] |
10:21 |
<jbond42> |
updating RemoteIP on phabricator https://gerrit.wikimedia.org/r/c/operations/puppet/+/649872 |
[production] |
09:57 |
<vgutierrez> |
repool ats-tls on cp5011 |
[production] |
09:00 |
<marostegui> |
Sanitize s1 and s5 on db1154 T268742 |
[production] |
08:30 |
<godog> |
swift codfw-prod: more weight to ms-be20[58-61] - T269337 |
[production] |
07:49 |
<ryankemper> |
[wdqs deploy] (wdqs deploy complete) |
[production] |
07:19 |
<marostegui> |
Stop mysql on db1082 to clone db1154 |
[production] |
07:19 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Depool db1082 for cloning db1154:3315 T268742 ', diff saved to https://phabricator.wikimedia.org/P13563 and previous config saved to /var/cache/conftool/dbconfig/20201217-071903-marostegui.json |
[production] |
07:18 |
<elukey> |
reboot an-airflow1001 for kernel upgrades |
[production] |