2020-09-22
|
08:20 |
<kormat@cumin1001> |
dbctl commit (dc=all): 'db2076 (re)pooling @ 25%: schema change T259831', diff saved to https://phabricator.wikimedia.org/P12714 and previous config saved to /var/cache/conftool/dbconfig/20200922-082010-kormat.json |
[production] |
08:13 |
<kormat> |
uploaded wmfmariadbpy v0.5 to apt. deploying now to fleet |
[production] |
08:11 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Pool es2032, es2033 and es2034 for the first time with minimal weight T261717', diff saved to https://phabricator.wikimedia.org/P12713 and previous config saved to /var/cache/conftool/dbconfig/20200922-081154-marostegui.json |
[production] |
07:57 |
<volans> |
migrating ulsfo private DNS records to the Netbox-generated ones - T258729 |
[production] |
07:54 |
<kormat@cumin1001> |
dbctl commit (dc=all): 'db2076 depooling: schema change T259831', diff saved to https://phabricator.wikimedia.org/P12712 and previous config saved to /var/cache/conftool/dbconfig/20200922-075429-kormat.json |
[production] |
07:51 |
<jayme> |
running ipvsadm -D -t 10.2.1.18:8080; ipvsadm -D -t 10.2.1.46:3030 on lvs2010.codfw.wmnet,lvs2009.codfw.wmnet - T255879 T254581 |
[production] |
07:49 |
<jayme> |
running ipvsadm -D -t 10.2.2.18:8080; ipvsadm -D -t 10.2.2.46:3030 on lvs1016.eqiad.wmnet,lvs1015.eqiad.wmnet - T255879 T254581 |
[production] |
07:46 |
<jayme> |
restarting pybal on lvs1015.eqiad.wmnet,lvs2009.codfw.wmnet - T255879 T254581 |
[production] |
07:42 |
<jayme> |
restarting pybal on lvs1016.eqiad.wmnet,lvs2010.codfw.wmnet - T255879 T254581 |
[production] |
07:39 |
<jayme> |
running puppet on lvs servers - T255879 T254581 |
[production] |
07:34 |
<volans> |
depooling ulsfo to merge DNS migration to Netbox zonefiles - T258729 |
[production] |
07:24 |
<marostegui> |
Stop MySQL on es2014 - host will be decommissioned T262889 |
[production] |
07:14 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Remove es2014 from dbctl T262889', diff saved to https://phabricator.wikimedia.org/P12711 and previous config saved to /var/cache/conftool/dbconfig/20200922-071435-marostegui.json |
[production] |
07:11 |
<XioNoX> |
cr1-codfw# run clear bfd session address fe80::f27c:c7ff:fe11:2c1b |
[production] |
06:18 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Depool es2014 for decommissioning T262889', diff saved to https://phabricator.wikimedia.org/P12710 and previous config saved to /var/cache/conftool/dbconfig/20200922-061815-marostegui.json |
[production] |
05:44 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'es2019 (re)pooling @ 100%: Slowly repool after recloning es2034 T261717 ', diff saved to https://phabricator.wikimedia.org/P12709 and previous config saved to /var/cache/conftool/dbconfig/20200922-054455-root.json |
[production] |
05:44 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'es2016 (re)pooling @ 100%: Slowly repool after recloning es2032 T261717 ', diff saved to https://phabricator.wikimedia.org/P12708 and previous config saved to /var/cache/conftool/dbconfig/20200922-054438-root.json |
[production] |
05:44 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'es2013 (re)pooling @ 100%: Slowly repool after recloning es2032 T261717 ', diff saved to https://phabricator.wikimedia.org/P12707 and previous config saved to /var/cache/conftool/dbconfig/20200922-054430-root.json |
[production] |
05:40 |
<marostegui> |
Log remove triggers on revision table on db1124:3313 T238966 |
[production] |
05:39 |
<marostegui> |
Deploy MCR schema change on s3 eqiad, this will generate lag on s3 on labsdb T238966 |
[production] |
05:33 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Add es2032, es2033 and es2034 into dbctl T261717', diff saved to https://phabricator.wikimedia.org/P12706 and previous config saved to /var/cache/conftool/dbconfig/20200922-053346-marostegui.json |
[production] |
05:29 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'es2019 (re)pooling @ 75%: Slowly repool after recloning es2034 T261717 ', diff saved to https://phabricator.wikimedia.org/P12705 and previous config saved to /var/cache/conftool/dbconfig/20200922-052951-root.json |
[production] |
05:29 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'es2016 (re)pooling @ 75%: Slowly repool after recloning es2032 T261717 ', diff saved to https://phabricator.wikimedia.org/P12704 and previous config saved to /var/cache/conftool/dbconfig/20200922-052935-root.json |
[production] |
05:29 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'es2013 (re)pooling @ 75%: Slowly repool after recloning es2032 T261717 ', diff saved to https://phabricator.wikimedia.org/P12703 and previous config saved to /var/cache/conftool/dbconfig/20200922-052926-root.json |
[production] |
05:14 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'es2019 (re)pooling @ 50%: Slowly repool after recloning es2034 T261717 ', diff saved to https://phabricator.wikimedia.org/P12702 and previous config saved to /var/cache/conftool/dbconfig/20200922-051448-root.json |
[production] |
05:14 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'es2016 (re)pooling @ 50%: Slowly repool after recloning es2032 T261717 ', diff saved to https://phabricator.wikimedia.org/P12701 and previous config saved to /var/cache/conftool/dbconfig/20200922-051431-root.json |
[production] |
05:14 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'es2013 (re)pooling @ 50%: Slowly repool after recloning es2032 T261717 ', diff saved to https://phabricator.wikimedia.org/P12700 and previous config saved to /var/cache/conftool/dbconfig/20200922-051423-root.json |
[production] |
05:00 |
<marostegui> |
Add es2032 es2033 and es2034 to tendril and zarcillo T261717 |
[production] |
04:59 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'es2019 (re)pooling @ 25%: Slowly repool after recloning es2034 T261717 ', diff saved to https://phabricator.wikimedia.org/P12699 and previous config saved to /var/cache/conftool/dbconfig/20200922-045944-root.json |
[production] |
04:59 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'es2016 (re)pooling @ 25%: Slowly repool after recloning es2032 T261717 ', diff saved to https://phabricator.wikimedia.org/P12698 and previous config saved to /var/cache/conftool/dbconfig/20200922-045928-root.json |
[production] |
04:59 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'es2013 (re)pooling @ 25%: Slowly repool after recloning es2032 T261717 ', diff saved to https://phabricator.wikimedia.org/P12697 and previous config saved to /var/cache/conftool/dbconfig/20200922-045919-root.json |
[production] |
01:35 |
<ryankemper> |
`sudo cumin C:profile::services_proxy::envoy 'enable-puppet "adding cloudelastic to the service proxy --rkemper"'` done |
[production] |
01:35 |
<ryankemper> |
woot! `curl -X GET -s 'http://localhost:6105/_cluster/health'` gives a response as expected. (As do 6106 and 6107). Re-enabling puppet across the fleet... |
[production] |
01:32 |
<ryankemper> |
`sudo run-puppet-agent -e "adding cloudelastic to the service proxy --rkemper"` on `mwdebug1002.eqiad.wmnet` |
[production] |
01:28 |
<ryankemper> |
`sudo puppet-merge` done, now will run puppet on a single eqiad appserver and verify we can curl `localhost:610{5,6,7}` |
[production] |
01:17 |
<ryankemper> |
Disabling puppet on affected nodes via `sudo cumin C:profile::services_proxy::envoy 'disable-puppet "adding cloudelastic to the service proxy --rkemper"'` |
[production] |
01:17 |
<ryankemper> |
Going to test patch to stick envoy in front of `cloudelastic`, see https://gerrit.wikimedia.org/r/c/operations/puppet/+/628243 |
[production] |
2020-09-21
|
23:42 |
<mholloway-shell@deploy1001> |
helmfile [codfw] Ran 'sync' command on namespace 'push-notifications' for release 'main' . |
[production] |
23:39 |
<mholloway-shell@deploy1001> |
helmfile [eqiad] Ran 'sync' command on namespace 'push-notifications' for release 'main' . |
[production] |
23:37 |
<mholloway-shell@deploy1001> |
helmfile [staging] Ran 'sync' command on namespace 'push-notifications' for release 'main' . |
[production] |
23:36 |
<mutante> |
debmonitor2002 - systemctl reset-failed |
[production] |
22:59 |
<mholloway-shell@deploy1001> |
helmfile [codfw] Ran 'sync' command on namespace 'push-notifications' for release 'main' . |
[production] |
22:57 |
<mholloway-shell@deploy1001> |
helmfile [eqiad] Ran 'sync' command on namespace 'push-notifications' for release 'main' . |
[production] |
22:55 |
<mholloway-shell@deploy1001> |
helmfile [staging] Ran 'sync' command on namespace 'push-notifications' for release 'main' . |
[production] |
22:20 |
<mutante> |
releases.wikimedia.org has been converted to an active-active service with geodns/backends in both DCs |
[production] |
21:56 |
<mholloway-shell@deploy1001> |
helmfile [codfw] Ran 'sync' command on namespace 'push-notifications' for release 'main' . |
[production] |
21:54 |
<mholloway-shell@deploy1001> |
helmfile [eqiad] Ran 'sync' command on namespace 'push-notifications' for release 'main' . |
[production] |
21:51 |
<mholloway-shell@deploy1001> |
helmfile [staging] Ran 'sync' command on namespace 'push-notifications' for release 'main' . |
[production] |
21:28 |
<dzahn@cumin1001> |
END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) |
[production] |
21:23 |
<dzahn@cumin1001> |
START - Cookbook sre.hosts.decommission |
[production] |