2020-06-01
§
|
09:18 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Add db1147 to dbctl, depooled T252512', diff saved to https://phabricator.wikimedia.org/P11341 and previous config saved to /var/cache/conftool/dbconfig/20200601-091809-marostegui.json |
[production] |
09:06 |
<filippo@cumin1001> |
END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) |
[production] |
09:05 |
<filippo@cumin1001> |
START - Cookbook sre.hosts.decommission |
[production] |
09:05 |
<filippo@cumin1001> |
END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) |
[production] |
09:05 |
<XioNoX> |
offline cr1-codfw:fpc0 - T254110 |
[production] |
09:05 |
<filippo@cumin1001> |
START - Cookbook sre.hosts.decommission |
[production] |
09:04 |
<filippo@cumin1001> |
END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) |
[production] |
09:03 |
<filippo@cumin1001> |
START - Cookbook sre.hosts.decommission |
[production] |
09:01 |
<RhinosF1> |
deleted sopel.bot deployment and stopped webservice - START T254046 |
[tools.zppixbot] |
08:58 |
<godog> |
prometheus eqiad lvextend --resizefs --size +100G vg-ssd/prometheus-ops |
[production] |
08:43 |
<mutante> |
deneb - apt-get remove --purge apt-listchanges (package was in status "rc" causing DPKG alert, should be removed but config was not purged) |
[production] |
08:41 |
<mutante> |
deneb - apt-get remove python3-debconf (package was in status "ri" causing DPKG icinga alert. ri means it should be removed but is not) |
[production] |
08:33 |
<XioNoX> |
restart cr1-codfw:fpc0 - T254110 |
[production] |
08:22 |
<mutante> |
mw1331 re-enabled puppet (SAL told me about an experiment a little while ago) |
[production] |
08:19 |
<jynus> |
disabling puppet on all db/es/pc hosts for deploy of gerrit:599596 |
[production] |
08:17 |
<RhinosF1> |
upload starter-new.sh and switched sopelbot.yaml for T254046 |
[tools.zppixbot] |
07:46 |
<RF1dle> |
add notice for T254046 to wiki index about |
[tools.zppixbot] |
07:05 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Depool db1142 to clone db1147 T252512', diff saved to https://phabricator.wikimedia.org/P11339 and previous config saved to /var/cache/conftool/dbconfig/20200601-070519-marostegui.json |
[production] |
06:53 |
<elukey> |
re-run virtualpageview-hourly-wf-2020-5-31-19 |
[analytics] |
06:28 |
<elukey> |
temporary stop of all RU jobs on an-launcher1001 to privilege camus and others |
[analytics] |
06:03 |
<elukey> |
kill all airflow-related processes on an-launcher1001 - host killing tasks due to OOM |
[analytics] |
05:03 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Depool enwiki db2071 slave to test new index - T238966', diff saved to https://phabricator.wikimedia.org/P11338 and previous config saved to /var/cache/conftool/dbconfig/20200601-050354-marostegui.json |
[production] |
04:54 |
<marostegui> |
Drop testreduce_0715 from m5 master T245408 |
[production] |
04:44 |
<marostegui> |
Depool db1141 from Analytics role - T249188 |
[production] |
00:39 |
<bd808> |
Ugh. Prior SAL message was about tools-sgeexec-0940 |
[tools] |
00:39 |
<bd808> |
Compressed /var/log/account/pacct.0 ahead of rotation schedule to free some space on the root partition |
[tools] |
00:31 |
<bd808> |
Also, why is tools.squirrelnestbot running a job for tools.unblockbot? |
[tools.squirrelnestbot] |
00:31 |
<bd808> |
Stopped grid job running tools.unblockbot/unblockbot.sh. Script is in an infinite crash loop because it does not handle https properly. |
[tools.squirrelnestbot] |
2020-05-30
§
|
21:52 |
<RhinosF1> |
Maint Complete! |
[tools.zppixbot-test] |
21:49 |
<wm-bot> |
<rhinosf1> chmod a+x starter-new.sh |
[tools.zppixbot-test] |
21:26 |
<RhinosF1> |
tools.zppixbot-test@tools-sgebastion-07:~/k8s$ take /data/project/zppixbot-test/k8s/starter-new.sh - I hate forklift's file handling at times |
[tools.zppixbot-test] |
21:14 |
<RhinosF1> |
tools.zppixbot-test@tools-sgebastion-07:~/.sopel$ kubectl scale --replicas=1 deployment.apps/sopeltest.bot |
[tools.zppixbot-test] |
21:01 |
<RhinosF1> |
rename starter.sh to starter-old and create starter-new - move zppixbot-test to use it for the deployment |
[tools.zppixbot-test] |
21:00 |
<RhinosF1> |
switch that to sopeltest.bot |
[tools.zppixbot-test] |
20:59 |
<RhinosF1> |
tools.zppixbot-test@tools-sgebastion-07:~/.sopel$ kubectl scale --replicas=0 deployment.apps/zppixbot-test |
[tools.zppixbot-test] |
16:53 |
<Reedy> |
Reloading Zuul to deploy https://gerrit.wikimedia.org/r/599963 |
[releng] |
14:30 |
<wm-bot> |
<lucaswerkmeister> deployed ff77d74e3f (add somevalue depicts statements) |
[tools.wd-image-positions] |
13:18 |
<Amir1> |
ladsgroup@deployment-deploy01:/srv/mediawiki/php-master$ mwscript maintenance/createAndPromote.php --wiki=fawiki --bureaucrat --force --interface-admin --sysop Ladsgroup (Part of T253291) |
[releng] |
08:15 |
<elukey> |
manual reset-failed of monitor_refine_mediawiki_job_events_failure_flags |
[analytics] |