2019-12-16
|
16:03 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Change weights from 1 to 100 on x1 slaves in eqiad and codfw - T231018', diff saved to https://phabricator.wikimedia.org/P9880 and previous config saved to /var/cache/conftool/dbconfig/20191216-160346-marostegui.json |
[production] |
15:41 |
<elukey@cumin1001> |
END (PASS) - Cookbook sre.druid.roll-restart-workers (exit_code=0) |
[production] |
15:28 |
<mforns@deploy1001> |
Finished deploy [analytics/refinery@1c72a71]: deploying analytics refinery for kerberos migration (duration: 07m 57s) |
[production] |
15:20 |
<mforns@deploy1001> |
Started deploy [analytics/refinery@1c72a71]: deploying analytics refinery for kerberos migration |
[production] |
15:15 |
<elukey@cumin1001> |
START - Cookbook sre.druid.roll-restart-workers |
[production] |
14:58 |
<cdanis> |
✔️ cdanis@mwdebug2001.codfw.wmnet ~ 🕤☕ scap pull |
[production] |
14:55 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Depool db1084 schema change', diff saved to https://phabricator.wikimedia.org/P9877 and previous config saved to /var/cache/conftool/dbconfig/20191216-145520-marostegui.json |
[production] |
14:49 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Repool db1121 after schema change', diff saved to https://phabricator.wikimedia.org/P9876 and previous config saved to /var/cache/conftool/dbconfig/20191216-144902-marostegui.json |
[production] |
14:46 |
<cdanis@deploy1001> |
Synchronized wmf-config/db-eqiad.php: db-eqiad: remove dbctl-obsoleted externalLoads section 5413a6d73 T229686 (duration: 00m 54s) |
[production] |
14:45 |
<cdanis@deploy1001> |
Synchronized wmf-config/db-codfw.php: db-codfw: remove dbctl-obsoleted externalLoads section 519e37461 T229686 (duration: 00m 54s) |
[production] |
14:39 |
<oblivian@deploy1001> |
helmfile [EQIAD] Ran 'apply' command on namespace 'blubberoid' for release 'production' . |
[production] |
14:39 |
<cdanis@deploy1001> |
Synchronized wmf-config/etcd.php: db-codfw: remove dbctl-obsoleted externalLoads section 519e37461 T229686 (duration: 00m 53s) |
[production] |
14:38 |
<oblivian@deploy1001> |
helmfile [CODFW] Ran 'apply' command on namespace 'blubberoid' for release 'production' . |
[production] |
14:36 |
<oblivian@deploy1001> |
helmfile [STAGING] Ran 'apply' command on namespace 'blubberoid' for release 'staging' . |
[production] |
14:35 |
<XioNoX> |
delete virtual chassis ID on asw-a-codfw |
[production] |
14:34 |
<XioNoX> |
delete virtual chassis ID on asw-b-codfw |
[production] |
14:32 |
<XioNoX> |
delete virtual chassis ID on asw-c-codfw |
[production] |
14:30 |
<cdanis> |
manual testing of I219711eb on mwdebug2001 |
[production] |
14:11 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Repool db1127 after testing', diff saved to https://phabricator.wikimedia.org/P9875 and previous config saved to /var/cache/conftool/dbconfig/20191216-141141-marostegui.json |
[production] |
14:09 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Depool db1127 from x1 for testing', diff saved to https://phabricator.wikimedia.org/P9874 and previous config saved to /var/cache/conftool/dbconfig/20191216-140951-marostegui.json |
[production] |
14:03 |
<cdanis@deploy1001> |
Synchronized wmf-config/etcd.php: enable dbctl for externalLoads 6dfb30c76 T229686 (duration: 00m 53s) |
[production] |
13:50 |
<elukey@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) |
[production] |
13:50 |
<elukey@cumin1001> |
START - Cookbook sre.hosts.downtime |
[production] |
13:33 |
<ema> |
cp-ats: rolling ats-backend-restart to apply ram cache size changes T238494 |
[production] |
13:33 |
<moritzm> |
restarting systemd-timesyncd on stat1005 |
[production] |
12:52 |
<elukey> |
shutdown of the Analytics Hadoop cluster to enable Kerberos |
[production] |
12:16 |
<elukey@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) |
[production] |
12:15 |
<elukey@cumin1001> |
START - Cookbook sre.hosts.downtime |
[production] |
12:12 |
<Urbanecm> |
EU SWAT done |
[production] |
12:11 |
<urbanecm@deploy1001> |
Synchronized wmf-config/InitialiseSettings.php: SWAT: 026913d: Add no=>nb in $wgInterlanguageLinkCodeMap (T174160) (duration: 00m 53s) |
[production] |
11:58 |
<jynus@cumin1001> |
dbctl commit (dc=all): 'Depool db1130', diff saved to https://phabricator.wikimedia.org/P9873 and previous config saved to /var/cache/conftool/dbconfig/20191216-115841-jynus.json |
[production] |
11:55 |
<hashar> |
Restarting Jenkins completely to flush out stalled Gearman functions in Zuul |
[production] |
11:41 |
<jdrewniak@deploy1001> |
Synchronized portals: Wikimedia Portals Update: [[gerrit:558017| Bumping portals to master (T128546)]] (duration: 00m 52s) |
[production] |
11:40 |
<jdrewniak@deploy1001> |
Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: [[gerrit:558017| Bumping portals to master (T128546)]] (duration: 00m 56s) |
[production] |
10:57 |
<elukey> |
disable puppet on labstore100[6,7] and stop analytics-related systemd timers - prep step for Kerberos |
[production] |
10:41 |
<XioNoX> |
delete virtual chassis ID on asw-d-codfw |
[production] |
10:14 |
<hashar> |
Restarting CI Jenkins due to out of sync state between Zuul Gearman and what is actually running (some jobs got lost) |
[production] |
09:50 |
<marostegui> |
Stop replication in the same position in labsdb1010 and labsdb1012 - T238399 |
[production] |
09:24 |
<hashar> |
Reloading Jenkins CI |
[production] |
09:14 |
<godog> |
upgrade hw raid firmware on ms-be2016 and reboot - T240798 |
[production] |
09:14 |
<filippo@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) |
[production] |
09:13 |
<filippo@cumin1001> |
START - Cookbook sre.hosts.downtime |
[production] |
09:04 |
<Urbanecm> |
mwscript importImages.php --wiki=commonswiki --comment-ext=txt --user=Coffeeandcrumbs /home/urbanecm/T240825 (T240825) |
[production] |
08:54 |
<ema> |
cp1077: ats-backend-restart to increase RAM cache size T238494 |
[production] |
08:53 |
<moritzm> |
powercycling ms-be2016 T240798 |
[production] |
08:36 |
<ema> |
cp1075: repool all services T240826 |
[production] |
08:12 |
<ema> |
cp1075: wipe varnish-fe and ats-be caches due to missed purges T240826 |
[production] |
08:08 |
<ema> |
cp1075: manually start vhtcpd.service T240826 |
[production] |
07:52 |
<ema> |
cp1075: depool, vhtcpd not running |
[production] |
07:38 |
<marostegui> |
Disable auto-learn on db21[03-35] T240823 |
[production] |