2020-09-22 §
07:49 <jayme> running ipvsadm -D -t 10.2.2.18:8080; ipvsadm -D -t 10.2.2.46:3030 on lvs1016.eqiad.wmnet,lvs1015.eqiad.wmnet - T255879 T254581 [production]
07:46 <jayme> restarting pybal on lvs1015.eqiad.wmnet,lvs2009.codfw.wmnet - T255879 T254581 [production]
07:42 <jayme> restarting pybal on lvs1016.eqiad.wmnet,lvs2010.codfw.wmnet - T255879 T254581 [production]
07:39 <jayme> running puppet on lvs servers - T255879 T254581 [production]
07:34 <volans> depooling ulsfo to merge DNS migration to Netbox zonefiles - T258729 [production]
07:24 <marostegui> Stop MySQL on es2014 - host will be decommissioned T262889 [production]
07:14 <marostegui@cumin1001> dbctl commit (dc=all): 'Remove es2014 from dbctl T262889', diff saved to https://phabricator.wikimedia.org/P12711 and previous config saved to /var/cache/conftool/dbconfig/20200922-071435-marostegui.json [production]
07:11 <XioNoX> cr1-codfw# run clear bfd session address fe80::f27c:c7ff:fe11:2c1b [production]
06:18 <marostegui@cumin1001> dbctl commit (dc=all): 'Depool es2014 for decommissioning T262889', diff saved to https://phabricator.wikimedia.org/P12710 and previous config saved to /var/cache/conftool/dbconfig/20200922-061815-marostegui.json [production]
05:44 <marostegui@cumin1001> dbctl commit (dc=all): 'es2019 (re)pooling @ 100%: Slowly repool after recloning es2034 T261717 ', diff saved to https://phabricator.wikimedia.org/P12709 and previous config saved to /var/cache/conftool/dbconfig/20200922-054455-root.json [production]
05:44 <marostegui@cumin1001> dbctl commit (dc=all): 'es2016 (re)pooling @ 100%: Slowly repool after recloning es2032 T261717 ', diff saved to https://phabricator.wikimedia.org/P12708 and previous config saved to /var/cache/conftool/dbconfig/20200922-054438-root.json [production]
05:44 <marostegui@cumin1001> dbctl commit (dc=all): 'es2013 (re)pooling @ 100%: Slowly repool after recloning es2032 T261717 ', diff saved to https://phabricator.wikimedia.org/P12707 and previous config saved to /var/cache/conftool/dbconfig/20200922-054430-root.json [production]
05:40 <marostegui> Log remove triggers on revision table on db1124:3313 T238966 [production]
05:39 <marostegui> Deploy MCR schema change on s3 eqiad, this will generate lag on s3 on labsdb T238966 [production]
05:33 <marostegui@cumin1001> dbctl commit (dc=all): 'Add es2032, es2033 and es2034 into dbctl T261717', diff saved to https://phabricator.wikimedia.org/P12706 and previous config saved to /var/cache/conftool/dbconfig/20200922-053346-marostegui.json [production]
05:29 <marostegui@cumin1001> dbctl commit (dc=all): 'es2019 (re)pooling @ 75%: Slowly repool after recloning es2034 T261717 ', diff saved to https://phabricator.wikimedia.org/P12705 and previous config saved to /var/cache/conftool/dbconfig/20200922-052951-root.json [production]
05:29 <marostegui@cumin1001> dbctl commit (dc=all): 'es2016 (re)pooling @ 75%: Slowly repool after recloning es2032 T261717 ', diff saved to https://phabricator.wikimedia.org/P12704 and previous config saved to /var/cache/conftool/dbconfig/20200922-052935-root.json [production]
05:29 <marostegui@cumin1001> dbctl commit (dc=all): 'es2013 (re)pooling @ 75%: Slowly repool after recloning es2032 T261717 ', diff saved to https://phabricator.wikimedia.org/P12703 and previous config saved to /var/cache/conftool/dbconfig/20200922-052926-root.json [production]
05:14 <marostegui@cumin1001> dbctl commit (dc=all): 'es2019 (re)pooling @ 50%: Slowly repool after recloning es2034 T261717 ', diff saved to https://phabricator.wikimedia.org/P12702 and previous config saved to /var/cache/conftool/dbconfig/20200922-051448-root.json [production]
05:14 <marostegui@cumin1001> dbctl commit (dc=all): 'es2016 (re)pooling @ 50%: Slowly repool after recloning es2032 T261717 ', diff saved to https://phabricator.wikimedia.org/P12701 and previous config saved to /var/cache/conftool/dbconfig/20200922-051431-root.json [production]
05:14 <marostegui@cumin1001> dbctl commit (dc=all): 'es2013 (re)pooling @ 50%: Slowly repool after recloning es2032 T261717 ', diff saved to https://phabricator.wikimedia.org/P12700 and previous config saved to /var/cache/conftool/dbconfig/20200922-051423-root.json [production]
05:00 <marostegui> Add es2032 es2033 and es2034 to tendril and zarcillo T261717 [production]
04:59 <marostegui@cumin1001> dbctl commit (dc=all): 'es2019 (re)pooling @ 25%: Slowly repool after recloning es2034 T261717 ', diff saved to https://phabricator.wikimedia.org/P12699 and previous config saved to /var/cache/conftool/dbconfig/20200922-045944-root.json [production]
04:59 <marostegui@cumin1001> dbctl commit (dc=all): 'es2016 (re)pooling @ 25%: Slowly repool after recloning es2032 T261717 ', diff saved to https://phabricator.wikimedia.org/P12698 and previous config saved to /var/cache/conftool/dbconfig/20200922-045928-root.json [production]
04:59 <marostegui@cumin1001> dbctl commit (dc=all): 'es2013 (re)pooling @ 25%: Slowly repool after recloning es2032 T261717 ', diff saved to https://phabricator.wikimedia.org/P12697 and previous config saved to /var/cache/conftool/dbconfig/20200922-045919-root.json [production]
01:35 <ryankemper> `sudo cumin C:profile::services_proxy::envoy 'enable-puppet "adding cloudelastic to the service proxy --rkemper"'` done [production]
01:35 <ryankemper> woot! `curl -X GET -s 'http://localhost:6105/_cluster/health'` gives a response as expected. (As do 6106 and 6107). Re-enabling puppet across the fleet... [production]
01:32 <ryankemper> `sudo run-puppet-agent -e "adding cloudelastic to the service proxy --rkemper"` on `mwdebug1002.eqiad.wmnet` [production]
01:28 <ryankemper> `sudo puppet-merge` done, now will run puppet on a single eqiad appserver and verify we can curl `localhost:610{5,6,7}` [production]
01:17 <ryankemper> Disabling puppet on affected nodes via `sudo cumin C:profile::services_proxy::envoy 'disable-puppet "adding cloudelastic to the service proxy --rkemper"'` [production]
01:17 <ryankemper> Going to test patch to stick envoy in front of `cloudelastic`, see https://gerrit.wikimedia.org/r/c/operations/puppet/+/628243 [production]
2020-09-21 §
23:42 <mholloway-shell@deploy1001> helmfile [codfw] Ran 'sync' command on namespace 'push-notifications' for release 'main' . [production]
23:39 <mholloway-shell@deploy1001> helmfile [eqiad] Ran 'sync' command on namespace 'push-notifications' for release 'main' . [production]
23:37 <mholloway-shell@deploy1001> helmfile [staging] Ran 'sync' command on namespace 'push-notifications' for release 'main' . [production]
23:36 <mutante> debmonitor2002 - systemctl reset-failed [production]
22:59 <mholloway-shell@deploy1001> helmfile [codfw] Ran 'sync' command on namespace 'push-notifications' for release 'main' . [production]
22:57 <mholloway-shell@deploy1001> helmfile [eqiad] Ran 'sync' command on namespace 'push-notifications' for release 'main' . [production]
22:55 <mholloway-shell@deploy1001> helmfile [staging] Ran 'sync' command on namespace 'push-notifications' for release 'main' . [production]
22:20 <mutante> releases.wikimedia.org has been converted to an active-active service with geodns/ backends in both DCs [production]
21:56 <mholloway-shell@deploy1001> helmfile [codfw] Ran 'sync' command on namespace 'push-notifications' for release 'main' . [production]
21:54 <mholloway-shell@deploy1001> helmfile [eqiad] Ran 'sync' command on namespace 'push-notifications' for release 'main' . [production]
21:51 <mholloway-shell@deploy1001> helmfile [staging] Ran 'sync' command on namespace 'push-notifications' for release 'main' . [production]
21:28 <dzahn@cumin1001> END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) [production]
21:23 <dzahn@cumin1001> START - Cookbook sre.hosts.decommission [production]
21:18 <pt1979@cumin2001> END (PASS) - Cookbook sre.dns.netbox (exit_code=0) [production]
21:12 <pt1979@cumin2001> START - Cookbook sre.dns.netbox [production]
20:49 <ebernhardson@deploy1001> Synchronized wmf-config/InitialiseSettings.php: adjust enwiktionary completion search ranking (duration: 00m 57s) [production]
20:47 <ebernhardson@deploy1001> Synchronized php-1.36.0-wmf.9/extensions/CirrusSearch/: Remove pages from completion search by page id (duration: 01m 00s) [production]
20:04 <herron> moving prometheus instance from bast3004 to prometheus3001 T243057 [production]
19:46 <herron> moving prometheus instance from bast4002 to prometheus4001 T243057 [production]