2020-05-29
|
22:32 |
<bstorm_> |
updated views on labsdb1010 T252219 |
[production] |
20:55 |
<bstorm_> |
updating views on labsdb1011 T252219 |
[production] |
19:27 |
<ryankemper> |
Successfully finished a rolling restart of the `cloudelastic` clusters (chi, psi, omega) as part of elasticsearch plugins upgrade. Host and service checks re-enabled. |
[production] |
17:28 |
<bstorm_> |
updating views on labsdb1009 T252219 |
[production] |
16:50 |
<ryankemper> |
Performing a rolling restart of the `cloudelastic` clusters (chi, psi, omega) as part of elasticsearch plugins upgrade. Host and service checks disabled. |
[production] |
16:00 |
<bstorm_> |
Updating views on labsdb1012 T252219 |
[production] |
15:59 |
<ryankemper> |
Concluded rolling restart of the `relforge` clusters as part of elasticsearch plugins upgrade. Both hosts `relforge1001` and `relforge1002` are back up. Downtime lifted. |
[production] |
15:29 |
<ryankemper> |
Performing a rolling restart of the `relforge` clusters as part of elasticsearch plugins upgrade |
[production] |
14:59 |
<cdanis> |
disabling puppet on netflow* to deploy Ic71e96f0 T253128 |
[production] |
14:47 |
<akosiaris@deploy1001> |
helmfile [CODFW] Ran 'sync' command on namespace 'kube-system' for release 'calico-policy-controller' . |
[production] |
14:47 |
<akosiaris@deploy1001> |
helmfile [CODFW] Ran 'sync' command on namespace 'kube-system' for release 'coredns' . |
[production] |
14:41 |
<akosiaris@deploy1001> |
helmfile [EQIAD] Ran 'sync' command on namespace 'kube-system' for release 'calico-policy-controller' . |
[production] |
14:41 |
<akosiaris@deploy1001> |
helmfile [EQIAD] Ran 'sync' command on namespace 'kube-system' for release 'coredns' . |
[production] |
14:35 |
<akosiaris@deploy1001> |
helmfile [STAGING] Ran 'sync' command on namespace 'kube-system' for release 'calico-policy-controller' . |
[production] |
14:35 |
<akosiaris@deploy1001> |
helmfile [STAGING] Ran 'sync' command on namespace 'kube-system' for release 'coredns' . |
[production] |
14:27 |
<hnowlan@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) |
[production] |
14:24 |
<hnowlan@cumin1001> |
START - Cookbook sre.hosts.downtime |
[production] |
14:15 |
<mdholloway> |
ran extensions/MachineVision/maintenance/removeBlacklistedSuggestions.php on commonswiki (T253821) |
[production] |
12:49 |
<hnowlan> |
reimaging restbase2009 after disk replacement |
[production] |
12:37 |
<elukey@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) |
[production] |
12:35 |
<elukey@cumin1001> |
START - Cookbook sre.hosts.downtime |
[production] |
12:15 |
<godog> |
roll-restart to upgrade thanos to 0.13.0rc0 - T252186 T233956 |
[production] |
11:32 |
<moritzm> |
installing cups security updates (client-side libs/tools) |
[production] |
11:01 |
<ema> |
upload prometheus-rdkafka-exporter 0.2 to buster-wikimedia T253551 |
[production] |
10:53 |
<moritzm> |
updating mwdebug2002 to 7.2.31 |
[production] |
10:02 |
<marostegui> |
Compress InnoDB on db1138 T232446 |
[production] |
08:30 |
<godog> |
update swift uid/gid on thanos hosts - T123918 |
[production] |
08:04 |
<mutante> |
phabricator - restarted apache2 - back for me now |
[production] |
08:03 |
<XioNoX> |
add new AMS-IX link to LACP bundle |
[production] |
08:01 |
<mutante> |
phabricator - broken due to "PhabricatorRepositoryMirrorEngine::pushToGitRepository" starting git process that uses 100% CPU, stopped phd service |
[production] |
07:56 |
<mutante> |
phabricator - killed pid 25070 (git) which used 100% of CPU, restarted phd service |
[production] |
07:25 |
<moritzm> |
updating perf on buster systems to new version from 10.4 point release |
[production] |
07:15 |
<moritzm> |
installing el-api update from latest Buster point release |
[production] |
07:12 |
<moritzm> |
installing xdg-utils update from latest Buster point release |
[production] |
07:11 |
<mutante> |
mw1293 (canary jobrunner) replace apache2.conf with version from mwdebug1001, restart apache, to debug for T190111 |
[production] |
07:00 |
<moritzm> |
installing rake security updates |
[production] |
06:36 |
<mutante> |
deneb - systemctl start docker-reporter-releng-images |
[production] |
05:20 |
<marostegui> |
Deploy schema change on db1138 (no longer s4 master) - T250055 |
[production] |
05:02 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Promote db1081 to s4 master and remove read-only from s4 T253808', diff saved to https://phabricator.wikimedia.org/P11334 and previous config saved to /var/cache/conftool/dbconfig/20200529-050224-marostegui.json |
[production] |
05:01 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Set s4 as read-only for maintenance T253808', diff saved to https://phabricator.wikimedia.org/P11333 and previous config saved to /var/cache/conftool/dbconfig/20200529-050153-marostegui.json |
[production] |
05:00 |
<marostegui> |
Starting s4 failover from db1138 to db1081 - T253808 |
[production] |
04:25 |
<marostegui> |
Start topology changes in s4 - T253808 |
[production] |