2020-06-01
|
17:47 |
<XioNoX> |
turn online cr1-codfw:fpc0 - T254110 |
[production] |
17:46 |
<shdubsh> |
update mtail in ulsfo caching hosts. restarting atsmtail and varnishmtail |
[production] |
17:31 |
<mutante> |
backup1001 - queued job 42 - gerrit backup after renaming of the file set and addition of LFS data (T254155, T254162); it is incremental, the full one already ran |
[production] |
16:49 |
<otto@deploy1001> |
Synchronized wmf-config/InitialiseSettings.php: EventLogging - fix searchsatisfaction schema URI - testwiki only - T249261 (duration: 00m 59s) |
[production] |
16:48 |
<otto@deploy1001> |
sync-file aborted: EventLogging - fix searchsatisfaction schema URI - testwiki only - T249261 (duration: 00m 02s) |
[production] |
16:39 |
<bstorm_> |
running view updates on db1141 T252219 |
[production] |
14:53 |
<elukey> |
ganeti: increase memory available for an-launcher1001 from 8g to 12g - T254125 |
[production] |
14:44 |
<volans> |
deploying ulsfo mgmt DNS records automatically generated by Netbox ( operations/dns/+/585545/ ) - T233183 |
[production] |
12:00 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Fully repool db1142, db1147 T252512', diff saved to https://phabricator.wikimedia.org/P11345 and previous config saved to /var/cache/conftool/dbconfig/20200601-120000-marostegui.json |
[production] |
11:44 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Slowly repool db1142, db1147 T252512', diff saved to https://phabricator.wikimedia.org/P11344 and previous config saved to /var/cache/conftool/dbconfig/20200601-114440-marostegui.json |
[production] |
11:30 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Slowly repool db1142, db1147 T252512', diff saved to https://phabricator.wikimedia.org/P11343 and previous config saved to /var/cache/conftool/dbconfig/20200601-113032-marostegui.json |
[production] |
10:49 |
<jdrewniak@deploy1001> |
Synchronized portals: Wikimedia Portals Update: [[gerrit:601328| Bumping portals to master (601328)]] (duration: 00m 59s) |
[production] |
10:48 |
<jdrewniak@deploy1001> |
Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: [[gerrit:601328| Bumping portals to master (601328)]] (duration: 01m 03s) |
[production] |
09:37 |
<volans@cumin1001> |
END (PASS) - Cookbook sre.dns.netbox (exit_code=0) |
[production] |
09:30 |
<volans@cumin1001> |
START - Cookbook sre.dns.netbox |
[production] |
09:26 |
<jynus> |
reenabling puppet on all db/es/pc hosts after deploy of gerrit:599596 |
[production] |
09:22 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Slowly repool db1142, db1147 T252512', diff saved to https://phabricator.wikimedia.org/P11342 and previous config saved to /var/cache/conftool/dbconfig/20200601-092220-marostegui.json |
[production] |
09:18 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Add db1147 to dbctl, depooled T252512', diff saved to https://phabricator.wikimedia.org/P11341 and previous config saved to /var/cache/conftool/dbconfig/20200601-091809-marostegui.json |
[production] |
09:06 |
<filippo@cumin1001> |
END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) |
[production] |
09:05 |
<filippo@cumin1001> |
START - Cookbook sre.hosts.decommission |
[production] |
09:05 |
<filippo@cumin1001> |
END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) |
[production] |
09:05 |
<XioNoX> |
offline cr1-codfw:fpc0 - T254110 |
[production] |
09:05 |
<filippo@cumin1001> |
START - Cookbook sre.hosts.decommission |
[production] |
09:04 |
<filippo@cumin1001> |
END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) |
[production] |
09:03 |
<filippo@cumin1001> |
START - Cookbook sre.hosts.decommission |
[production] |
08:58 |
<godog> |
prometheus eqiad lvextend --resizefs --size +100G vg-ssd/prometheus-ops |
[production] |
08:43 |
<mutante> |
deneb - apt-get remove --purge apt-listchanges (package was in status "rc" causing DPKG alert, should be removed but config was not purged) |
[production] |
08:41 |
<mutante> |
deneb - apt-get remove python3-debconf (package was in status "ri" causing DPKG icinga alert. ri means it should be removed but is not) |
[production] |
08:33 |
<XioNoX> |
restart cr1-codfw:fpc0 - T254110 |
[production] |
08:22 |
<mutante> |
mw1331 re-enabled puppet (SAL told me about an experiment a little while ago) |
[production] |
08:19 |
<jynus> |
disabling puppet on all db/es/pc hosts for deploy of gerrit:599596 |
[production] |
07:05 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Depool db1142 to clone db1147 T252512', diff saved to https://phabricator.wikimedia.org/P11339 and previous config saved to /var/cache/conftool/dbconfig/20200601-070519-marostegui.json |
[production] |
05:03 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Depool enwiki db2071 slave to test new index - T238966', diff saved to https://phabricator.wikimedia.org/P11338 and previous config saved to /var/cache/conftool/dbconfig/20200601-050354-marostegui.json |
[production] |
04:54 |
<marostegui> |
Drop testreduce_0715 from m5 master T245408 |
[production] |
04:44 |
<marostegui> |
Depool db1141 from Analytics role - T249188 |
[production] |
2020-05-29
|
22:32 |
<bstorm_> |
updated views on labsdb1010 T252219 |
[production] |
20:55 |
<bstorm_> |
updating views on labsdb1011 T252219 |
[production] |
19:27 |
<ryankemper> |
Successfully finished a rolling restart of the `cloudelastic` clusters (chi, psi, omega) as part of elasticsearch plugins upgrade. Host and service checks re-enabled. |
[production] |
17:28 |
<bstorm_> |
updating views on labsdb1009 T252219 |
[production] |
16:50 |
<ryankemper> |
Performing a rolling restart of the `cloudelastic` clusters (chi, psi, omega) as part of elasticsearch plugins upgrade. Host and service checks disabled. |
[production] |
16:00 |
<bstorm_> |
Updating views on labsdb1012 T252219 |
[production] |
15:59 |
<ryankemper> |
Concluded rolling restart of the `relforge` clusters as part of elasticsearch plugins upgrade. Both hosts `relforge1001` and `relforge1002` are back up. Downtime lifted. |
[production] |
15:29 |
<ryankemper> |
Performing a rolling restart of the `relforge` clusters as part of elasticsearch plugins upgrade |
[production] |
14:59 |
<cdanis> |
disabling puppet on netflow* to deploy Ic71e96f0 T253128 |
[production] |
14:47 |
<akosiaris@deploy1001> |
helmfile [CODFW] Ran 'sync' command on namespace 'kube-system' for release 'calico-policy-controller' . |
[production] |
14:47 |
<akosiaris@deploy1001> |
helmfile [CODFW] Ran 'sync' command on namespace 'kube-system' for release 'coredns' . |
[production] |
14:41 |
<akosiaris@deploy1001> |
helmfile [EQIAD] Ran 'sync' command on namespace 'kube-system' for release 'calico-policy-controller' . |
[production] |
14:41 |
<akosiaris@deploy1001> |
helmfile [EQIAD] Ran 'sync' command on namespace 'kube-system' for release 'coredns' . |
[production] |
14:35 |
<akosiaris@deploy1001> |
helmfile [STAGING] Ran 'sync' command on namespace 'kube-system' for release 'calico-policy-controller' . |
[production] |