2020-01-24
|
19:29 |
<jeh> |
upgrade cloudcontrol100[34] to the latest python3 openstack clients in stretch |
[openstack] |
17:54 |
<jforrester@deploy1001> |
Synchronized wmf-config/InitialiseSettings.php: Clean up CheckUser config (duration: 01m 09s) |
[production] |
15:45 |
<bd808> |
Rebuilding all Docker containers again because I failed to actually update the build server git clone properly last time I did this |
[tools] |
15:43 |
<gehel> |
restart blazegraph + updater on wdqs1007 (seems stuck, known issue) |
[production] |
15:33 |
<otto@deploy1001> |
helmfile [STAGING] Ran 'apply' command on namespace 'eventstreams' for release 'production' . |
[production] |
15:10 |
<jeh> |
remove icinga downtime for cloudvirt1013 T241313 |
[admin] |
14:28 |
<vgutierrez> |
uploaded mtail 3.0.0~rc5-1~bpo9+1wmf2 to apt.wm.o (buster) - T243591 |
[production] |
14:26 |
<akosiaris@deploy1001> |
helmfile [EQIAD] Ran 'sync' command on namespace 'kube-system' for release 'calico-policy-controller' . |
[production] |
14:24 |
<akosiaris@deploy1001> |
helmfile [CODFW] Ran 'sync' command on namespace 'kube-system' for release 'calico-policy-controller' . |
[production] |
14:23 |
<akosiaris@deploy1001> |
helmfile [STAGING] Ran 'sync' command on namespace 'kube-system' for release 'calico-policy-controller' . |
[production] |
13:16 |
<akosiaris@deploy1001> |
helmfile [STAGING] Ran 'sync' command on namespace 'kube-system' for release 'rbac-deploy-clusterrole' . |
[production] |
12:52 |
<arturo> |
repooling cloudvirt1013 after HW got fixed (T241313) |
[admin] |
11:09 |
<moritzm> |
purged stale grafana package from grafana1001, caused systemd unit failure |
[production] |
11:04 |
<effie> |
restart php-fpm on mw1238-mw1239 |
[production] |
09:29 |
<akosiaris> |
disable and mask etherpad-lite on etherpad1002 to avoid corruption issues. T224580 |
[production] |
08:42 |
<marostegui> |
Remove wikiadmin2 user from pc2XXX codfw hosts T243512 |
[production] |
08:17 |
<moritzm> |
installing python-apt security updates |
[production] |
07:19 |
<_joe_> |
force run puppet on all esams cache nodes, for mitigation of T243313 |
[production] |
06:37 |
<marostegui> |
Stop replication on db1107 |
[production] |
06:12 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Repool db2085 after memory replacement T243148', diff saved to https://phabricator.wikimedia.org/P10256 and previous config saved to /var/cache/conftool/dbconfig/20200124-061228-marostegui.json |
[production] |
05:23 |
<bd808> |
Building 6 new tools-k8s-worker instances for the 2020 Kubernetes cluster (take 2) |
[tools] |
04:41 |
<bd808> |
Rebuilding all Docker images to pick up webservice-python-bootstrap changes |
[tools] |
01:52 |
<thcipriani> |
Updating docker-pkg files on contint1001 for https://gerrit.wikimedia.org/r/566911 |
[releng] |
01:45 |
<thcipriani> |
Updating docker-pkg files on contint1001 for https://gerrit.wikimedia.org/r/566910 |
[releng] |
01:24 |
<mutante> |
running puppet on cp-text_ulsfo |
[production] |
01:11 |
<Krenair> |
((But maybe should be returning HTTP 405 rather than 404 in this case?)) |
[tools.translation-server] |
01:10 |
<Krenair> |
(I assumed it presented a UI, in fact this API appears to be working now.) |
[tools.translation-server] |
00:58 |
<Krenair> |
Restarted pod per https://en.wikipedia.org/wiki/User_talk:Krenair#Citation_bot - seemed to be stuck, though now it just appears to return its own 404 |
[tools.translation-server] |
00:46 |
<mutante> |
cp4032 - starting varnishmtail.service |
[production] |
00:36 |
<catrope@deploy1001> |
Synchronized php-1.35.0-wmf.16/extensions/CentralNotice/resources/ext.centralNotice.display/hide.js: T240802 (duration: 01m 05s) |
[production] |
00:34 |
<catrope@deploy1001> |
Synchronized php-1.35.0-wmf.15/extensions/CentralNotice/resources/ext.centralNotice.display/hide.js: T240802 (duration: 01m 07s) |
[production] |
00:33 |
<mutante> |
cp4032 - starting varnishmtail.service, which had failed |
[production] |
00:32 |
<catrope@deploy1001> |
Synchronized wmf-config/InitialiseSettings.php: Bump Parsoid/PHP cluster memory_limit again (T239806, T236833) (duration: 01m 05s) |
[production] |
00:08 |
<thcipriani> |
Updating docker-pkg files on contint1001 for https://gerrit.wikimedia.org/r/566895 |
[releng] |
00:02 |
<James_F> |
Taking deployment-deploy01 offline to fix beta config update deadlock. |
[releng] |
2020-01-23
|
23:43 |
<thcipriani> |
Updating docker-pkg files on contint1001 for https://gerrit.wikimedia.org/r/562365 and https://gerrit.wikimedia.org/r/562364 |
[releng] |
23:38 |
<bd808> |
Halted tools-k8s-worker build script after first instance (tools-k8s-worker-10) stuck in "scheduling" state for 20 minutes |
[tools] |
23:16 |
<bd808> |
Building 6 new tools-k8s-worker instances for the 2020 Kubernetes cluster |
[tools] |
22:20 |
<bd808> |
Updated to 6eb70b916dceb9fac4ed48a9789dd8d388f1f47d |
[tools.my-first-flask-oauth-tool] |
21:59 |
<James_F> |
Drop REL1_32 testing, jobs, and pipelines, EOLed T242981 |
[releng] |
21:50 |
<halfak> |
deploying ores 039251f (reverting to last good state) |
[releng] |
21:50 |
<halfak> |
deploying ores 039251f (reverting to last good state) |
[deployment-prep] |
21:09 |
<jeh> |
cloudvirt1024 set icinga downtime and powering down for hardware maintenance T241884 |
[openstack] |
21:08 |
<otto@deploy1001> |
helmfile [STAGING] Ran 'apply' command on namespace 'eventstreams' for release 'production' . |
[production] |
20:30 |
<brennen@deploy1001> |
rebuilt and synchronized wikiversions files: Revert "group2 wikis to 1.35.0-wmf.15" |
[production] |
20:29 |
<brennen> |
reverting group2 to 1.35.0-wmf.15 |
[production] |
20:17 |
<jeh> |
cloudvirt1013 set icinga downtime and powering down for hardware maintenance T241313 |
[openstack] |
20:16 |
<jeh> |
cloudvirt1013 set icinga downtime and powering down for hardware maintenance |
[openstack] |
20:10 |
<brennen@deploy1001> |
rebuilt and synchronized wikiversions files: all wikis to 1.35.0-wmf.16 |
[production] |
20:00 |
<Urbanecm> |
Morning SWAT done |
[production] |